ISBN: 9781467381239 (print)
In modern verification environments, FPGA-based prototyping has become an important part of the overall verification flow. The ability to run real-time applications at more realistic speeds allows much higher coverage than traditional HDL logic simulators. The main disadvantage of FPGA prototyping is the inability to inspect and observe internal FPGA signals. There are currently two traditional solutions to this problem. The first uses embedded trace buffers to record a subset of internal signals; the second captures a snapshot of the current FPGA state. Both techniques have benefits and shortcomings. In this paper, we present the idea of merging these two techniques into a new hybrid approach. Using this idea, we created a hybrid circuit, and our experiments showed that it preserves the advantages of both traditional approaches.
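To make the two traditional techniques concrete, the following is a minimal software analogy, not the hybrid circuit from the paper. It assumes a toy design held in a dictionary; the names `TraceBuffer`, `snapshot`, and the signals `count`, `wrap`, and `hidden` are hypothetical. The trace buffer keeps a short history of a few watched signals, while the snapshot captures every signal at a single instant.

```python
from collections import deque


class TraceBuffer:
    """Fixed-depth history of a chosen subset of signals (hypothetical model)."""

    def __init__(self, depth, watched):
        self.watched = tuple(watched)
        # A bounded deque drops the oldest sample once it is full.
        self.samples = deque(maxlen=depth)

    def capture(self, cycle, signals):
        values = tuple(signals[name] for name in self.watched)
        self.samples.append((cycle, values))


def snapshot(signals):
    """Copy of the complete design state at one instant."""
    return dict(signals)


# Toy design: a 3-bit counter, a wrap flag, and a signal that is not traced.
state = {"count": 0, "wrap": 0, "hidden": 0}
trace = TraceBuffer(depth=4, watched=["count", "wrap"])

for cycle in range(10):
    state["count"] = (state["count"] + 1) % 8
    state["wrap"] = int(state["count"] == 0)
    state["hidden"] ^= state["count"] & 1
    trace.capture(cycle, state)

final = snapshot(state)
print(list(trace.samples))  # history of the watched signals only
print(final)                # all signals, but only at the final cycle
```

The trade-off the abstract describes is visible here: the trace never sees `hidden`, and the snapshot has no history.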
ISBN: 9781424438914 (print)
Field-programmable gate arrays (FPGAs) have become increasingly popular in circuit development due to their rapid development times and low costs. With their increased use, the need to protect their intellectual property (IP) becomes more urgent. The digital fingerprint (DF) accomplishes this by creating a unique identification (ID) for each FPGA. In this research, we propose methods to dramatically increase the stability and robustness of the digital fingerprint ID through the proper choice of input sequences. We also show that by properly choosing the input word, we can significantly increase the DF's resistance to operating temperature changes.
ISBN: 9781424403127 (print)
This paper presents preliminary work exploring adaptive field-programmable gate arrays (AFPGAs). An AFPGA is adaptive in the sense that the functionality of subcircuits placed on the chip can change in response to changes observed on certain control signals. We describe the high-level architecture, which adds control logic and SRAM bits to a traditional FPGA to produce an AFPGA. We also describe a synthesis method that identifies and resynthesizes mutually exclusive pieces of logic so that they can share the resources available in an AFPGA. The architectural feature and its associated synthesis method help reduce circuit size by 28% on average and by up to 40% on select circuits.
ISBN: 9781467381239 (print)
This paper presents a compact FPGA-based implementation of the AES encryption standard, specifically designed for processing two independent 128-bit input blocks in feedback modes. This configuration is focused in particular on the Counter with CBC-MAC Protocol (CCMP), but can also be adapted to other AES-based encryption-authentication protocols that require processing two independent data streams. Most state-of-the-art solutions implementing CCMP use large datapaths, sometimes with separate encryption datapaths for the different data streams, leading to low resource efficiency. The work proposed here suggests that, with adequate use of FPGA components and proper data scheduling, a very compact and efficient dual AES core can be derived, particularly on FPGAs. Overall, the proposed structure allows for a throughput of 1.7 Gbps while achieving a throughput/slice efficiency of 24.22 Mbps/slice, 47% higher than the existing related state of the art.
ISBN: 9781467381239 (print)
Numerical measures of similarity/distance between objects represented by binary vectors are common in a wide range of disciplines. Searching in large-scale chemical databases requires billions of comparisons between molecules that are represented by binary fingerprints to capture the atomic structure. The performance bottleneck here is the enumeration of set bits in vectors (population count). Due to the discrete representation, similarity measures between binary fingerprints should map well to FPGAs. We present an architecture to accelerate binary similarity assessment, evaluate various design points, and compare performance to highly optimized CPU and GPU implementations. We implement an RTL generation tool, SimGenRTL, to generate accelerators of various sizes based on the proposed architecture. We find that accelerators with fewer and wider population counters allow better distribution of the hardware resources, significantly outperforming accelerators with more and narrower bit-enumeration components. SimGenRTL is available for download to allow rapid design space exploration of the computational core ahead of a full custom system implementation.
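As a small illustration of why population count dominates this workload, the sketch below computes a similarity between two binary fingerprints stored as Python integers. The abstract does not name a specific measure; the Tanimoto (Jaccard) coefficient is assumed here because it is commonly used for chemical fingerprints. The function names and the example fingerprints are hypothetical.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity: |a AND b| / |a OR b| for binary fingerprints."""
    union = popcount(a | b)
    if union == 0:
        # Convention assumed for this sketch: two empty fingerprints match.
        return 1.0
    return popcount(a & b) / union


# Two 8-bit example fingerprints (real fingerprints are far wider).
fp_a = 0b10110010
fp_b = 0b10010110

print(tanimoto(fp_a, fp_b))  # 3 shared bits out of 5 set overall
```

Each comparison reduces to two bitwise operations and two population counts, which is why the width and number of population counters is the main design parameter in hardware.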
ISBN: 9781467381239 (print)
We consider implementing FPGAs using a standard cell design methodology, and present a framework for the automated generation of synthesizable FPGA fabrics. The open-source Verilog-to-Routing (VTR) FPGA architecture evaluation framework [1] is extended to generate synthesizable Verilog for its in-memory FPGA architectural device model. The Verilog can be synthesized into standard cells, then placed and routed using an ASIC design flow. A second extension to VTR generates a configuration bitstream for the FPGA; that is, the bitstream configures the FPGA to realize a user-provided placed and routed design. The proposed framework and methodology open the door to silicon implementation of a wide range of VTR-modelled FPGA fabrics. In an experimental study, area- and timing-optimized FPGA implementations in 65 nm TSMC standard cells are compared with a 65 nm Altera commercial FPGA.
ISBN: 9789090304281 (print)
The benefits of customising the precision throughout an FPGA design according to a design tolerance are well known. However, customising the precision of a design at run-time has the potential for an even greater performance impact. In this paper, we add the ability to dynamically choose the internal precision of a datapath. This enables a result that is at least as accurate as the worst case under standard precisions, whilst internally operating at a lower precision. We demonstrate this technique on fused floating-point dot-product circuits. We show that for circuits whose inputs have a wide dynamic range, substantial resource savings are possible. We provide examples with savings of up to 75% of the DSPs and 16% of the ALMs over an optimised fused dot-product design.
ISBN: 9781467381239 (print)
Microarchitecture optimization is essential in processor design to achieve the target system performance. Given the register-transfer level (RTL) model of a real chip design, this paper proposes the MOFPGA system, which uses field-programmable gate array (FPGA) prototyping as an effective method for fine-grain microarchitecture optimization. It is a fast, reconfigurable, and visible platform with zero impact on the performance of the monitored processor. MOFPGA implements a complete computing platform equipped with a modern out-of-order processor and achieves a processor frequency of 60 MHz. Besides general FPGA implementation techniques such as multi-port SRAM design and gated-clock conversion, extensive optimization effort is made to improve FPGA performance when mapping such a large core. To our knowledge, MOFPGA is the first published FPGA system that implements a modern out-of-order processor running at such a high frequency and that can report real SPEC CPU2000 evaluation results.
ISBN: 9789090304281 (print)
In this paper, we present an FPGA hardware implementation approach for phylogenetic tree reconstruction with a maximum parsimony algorithm. The algorithm, based on stochastic local search, uses the Indirect Calculation of Tree Lengths and the Incremental Tree Optimization methods. We evaluate and compare our new approach against previous hardware approaches, and against TNT, the fastest available parsimony program. We make this comparison for eight real-world biological datasets. We obtain acceleration rates per tree between 2.60 and 4.68 against a previous approach that does not use incremental optimization; between 2.19 and 4.05 against a previous approach that uses an alternative second-pass optimization; and between 2.66 and 31.94 against TNT. Our approach is not only faster, but also uses no additional memories during the incremental optimization, unlike a software approach.
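For readers unfamiliar with maximum parsimony, the sketch below shows what "tree length" means by scoring a fixed tree with Fitch's classic bottom-up pass. This is a standard textbook method, swapped in for illustration; it is not the Indirect Calculation of Tree Lengths or the Incremental Tree Optimization used in the paper. The tree encoding (nested 2-tuples of leaf names), the function names, and the toy alignment are all assumptions.

```python
def fitch_site(tree, site):
    """Return (candidate state set, number of changes) for one character site.

    tree: a leaf name (str) or a 2-tuple of subtrees
    site: dict mapping leaf name -> character state at this site
    """
    if isinstance(tree, str):
        return {site[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch_site(left, site)
    right_states, right_cost = fitch_site(right, site)
    common = left_states & right_states
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: one change is required on this node's branches.
    return left_states | right_states, left_cost + right_cost + 1


def tree_length(tree, alignment):
    """Parsimony length: total changes summed over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Toy alignment of four taxa with three sites each (hypothetical data).
alignment = {"A": "AAG", "B": "AAG", "C": "ACG", "D": "GCT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(tree_length(tree_1, alignment))  # 3
print(tree_length(tree_2, alignment))  # 4
```

A local search such as the one the abstract describes repeatedly evaluates this kind of score on rearranged trees and keeps the shorter ones, which is why per-tree evaluation speed is the figure of merit.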
This study describes the integration of thermally assisted switching magnetic random access memories (TAS-MRAMs) in field-programmable gate array (FPGA) design. Non-volatility is achieved through the use of magnetic tunnelling junctions (MTJs) in an MRAM cell. A TAS scheme is used to write data to the MTJ device, which helps to reduce power consumption during a write operation in comparison with the conventional writing scheme used in MTJ devices. Furthermore, non-volatility reduces both the power consumption and the configuration time required at each power-up of the circuit in comparison with classical static random access memory-based FPGAs. An innovative architecture additionally provides run-time reconfigurable (RTR) support at minimum area overhead. An RTR FPGA element using TAS-MRAM allows dynamic reconfiguration mechanisms while featuring a simple design architecture.