Application-driven computers for Lattice Gauge Theory simulations have often been based on system-on-chip designs, but the development costs can be prohibitive for academic project budgets. An alternative approach uses compute nodes based on a commercial processor tightly coupled to a custom-designed network processor. Preliminary analysis shows that this solution offers good performance, but it also entails several challenges, including those arising from the processor's multicore structure and from implementing the network processor on a field-programmable gate array.
Authenticated ciphers are cryptographic transformations which combine the functionality of confidentiality, integrity, and authentication. This research uses register transfer-level (RTL) design to describe selected authenticated ciphers using a hardware description language (HDL), verifies their proper operation through functional simulation, and implements them on target FPGAs. The authenticated ciphers chosen for this research are the CAESAR Round Two variants of SCREAM, POET, Minalpher, and OMD. Ciphers are discussed from an engineering standpoint, and are compared and contrasted in terms of design features. To ensure conformity and standardization in evaluation, all four candidates are implemented with an identical version of the CAESAR Hardware API for authenticated ciphers. Functionally correct implementations of all four ciphers are realized, and results are compared against each other and previous results in terms of throughput, area, and throughput-to-area (TP/A) ratio. SCREAM is found to have the highest TP/A ratio of these four ciphers in the Virtex-6 FPGA, while Minalpher has the highest TP/A ratio in the Virtex-7 FPGA. (C) 2017 Elsevier B.V. All rights reserved.
This paper presents an efficient parallel architecture for implementation of a constant modulus algorithm (CMA) adaptive array antenna. By inserting delay units into the original CMA, a novel delayed CMA (DCMA) that can significantly reduce the associated critical path is derived. Consequently, a pipelined architecture that supports parallel processing is introduced for implementation of the DCMA. In addition to the pipelining technique, a power-of-two multiplier is proposed for the DCMA, leading to an efficient FPGA implementation. The effects of delays and finite word-length on the convergence property of CMA are investigated via simulations. Moreover, synthesis results show that an FPGA implementation of the proposed architecture using power-of-two arithmetic uses 26.9% fewer resources than a fixed-point implementation and operates at a clock frequency above 65 MHz. The implemented FPGA was tested to confirm that the designed architecture operates well for CMA.
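As a rough illustration of the delayed-update idea, the sketch below runs a textbook CMA(2,2) weight update in which the correction term is applied `delay` samples late and the step size is a power of two (so that, in fixed-point hardware, the scaling would reduce to a shift). The function name, the parameters `shift`, `delay`, and `modulus`, and the exact placement of the delay are assumptions made for illustration; the abstract does not specify the DCMA recursion or the design of its power-of-two multiplier.

```python
from collections import deque


def delayed_cma(samples, weights, shift=6, delay=1, modulus=1.0):
    """Textbook CMA(2,2) with a delayed weight correction (illustrative only).

    samples: iterable of per-snapshot lists of complex antenna samples.
    weights: initial complex weight vector.
    shift:   step size is 2**-shift (assumed; a shift in fixed-point hardware).
    delay:   number of snapshots by which the correction term lags.
    """
    mu = 2.0 ** -shift
    pending = deque()  # (snapshot, error) pairs waiting to be applied
    w = list(weights)
    outputs = []
    for x in samples:
        # Array output y = w^H x
        y = sum(wi.conjugate() * xi for wi, xi in zip(w, x))
        outputs.append(y)
        # CMA(2,2) error term: e = y * (|y|^2 - R)
        err = y * (abs(y) ** 2 - modulus)
        pending.append((x, err))
        # Apply the correction that is `delay` snapshots old, if available
        if len(pending) > delay:
            x_old, e_old = pending.popleft()
            w = [wi - mu * xi * e_old.conjugate() for wi, xi in zip(w, x_old)]
    return w, outputs
```

Setting `delay=0` recovers the ordinary sample-by-sample CMA update; a nonzero delay leaves room for pipeline registers between the error computation and the weight update, which is the general motivation for delayed adaptive filters.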
A novel low-complexity combined resampling, retiming and equalizing (RRE) algorithm is proposed. The RRE algorithm uses a single FIR filter for resampling, retiming and equalizing and thus lowers the complexity. In the numerical simulation, with an oversampling rate of 32/27, compared to the traditional time-domain scheme with a 15-tap CMA equalizer and the frequency-domain scheme based on a 256-point FFT, the RRE algorithm with a 15-tap RRE filter lowers the error vector magnitude (EVM) by 0.036 dB and 0.043 dB and lowers the complexity by 48.3% and 31.9%, respectively. In the offline experiment, with a received optical power of -35 dBm and the same two reference schemes, the RRE algorithm with a 15-tap RRE filter lowers the EVM by 0.26 dB and 0.36 dB, respectively, with the same complexity reductions of 48.3% and 31.9%. The RRE algorithm also enables a real-time 106.24 Gbps (26.56 GBaud) DP-QPSK coherent optical receiver based on a single FPGA chip using four 6-bit ADCs with a sampling rate of approximately 31.48 GSa/s. The FPGA-based receiver achieves a sensitivity of -34 dBm at a BER of 1E-3. To our knowledge, this is the highest reported bit rate of a coherent receiver based on a single FPGA chip.
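The rates quoted above are mutually consistent, which a short calculation makes explicit. The sketch below assumes the usual DP-QPSK accounting of 2 bits per symbol on each of 2 polarizations and applies the stated 32/27 oversampling ratio to the symbol rate; the function names are hypothetical.

```python
from fractions import Fraction


def line_rate_gbps(baud_gbaud, bits_per_symbol=2, polarizations=2):
    """Raw bit rate of a dual-polarization QPSK signal in Gbps."""
    return baud_gbaud * bits_per_symbol * polarizations


def adc_rate_gsa(baud_gbaud, oversampling=Fraction(32, 27)):
    """ADC sampling rate in GSa/s for a given oversampling ratio."""
    return baud_gbaud * oversampling


baud = Fraction("26.56")  # symbol rate in GBaud, taken from the abstract
print(float(line_rate_gbps(baud)))          # bit rate in Gbps
print(round(float(adc_rate_gsa(baud)), 2))  # sampling rate in GSa/s
```

Exact fractions are used so that the 32/27 ratio is not rounded before the final conversion to a decimal value.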
In this paper, a novel design of Paillier encryption with a modified polar encoding is proposed and analyzed. A new cross-partitioned add-shift processing element based on a perfect reconstruction technique is designed for the realization of encryption, with adders and shifters distributed so as to minimize logic component and register usage. In addition, a modified architecture for the polar encoder with optimal delay and minimized hardware resources is achieved by presenting a novel delay calculation methodology followed by register allocation, grouped as the Reduced Register Delay Allocation (RRDA) algorithm. Resource utilization, including slice registers, lookup tables (LUTs) and DSP blocks, is measured along with operating speed and throughput for the proposed Paillier encryption and polar encoder. Finally, the performance is compared with that of existing designs. The proposed sequential encoding-encryption can be deployed in imminent 5G systems. (C) 2017 Elsevier Ltd. All rights reserved.
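For readers unfamiliar with the cipher being accelerated, the following is a minimal software sketch of textbook Paillier encryption with the common choice g = n + 1. It is not the hardware architecture described above; the tiny primes, the explicitly supplied randomizer `r`, and the function names are assumptions chosen only to keep the example small and deterministic, and they offer no security.

```python
import math


def keygen(p, q):
    """Textbook Paillier key generation with g = n + 1 (toy parameters only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu = L(g^lam mod n^2)^-1 mod n, where L(u) = (u - 1) // n
    # (modular inverse via pow(x, -1, n) needs Python 3.8+)
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m, r):
    """c = g^m * r^n mod n^2; r must be coprime to n (random in practice)."""
    n, g = pub
    n2 = n * n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """m = L(c^lam mod n^2) * mu mod n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


pub, priv = keygen(11, 13)          # toy primes, for illustration only
c1 = encrypt(pub, 42, r=5)
c2 = encrypt(pub, 17, r=7)
n2 = pub[0] ** 2
print(decrypt(pub, priv, c1))               # recovers the first plaintext
print(decrypt(pub, priv, (c1 * c2) % n2))   # plaintext sum modulo n
```

The dominant cost is modular exponentiation modulo n squared, which is built from repeated modular multiplications; that is the kind of operation an add-and-shift processing element would target in hardware.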
We present a new technique for testing field programmable gate arrays (FPGA's) based on look-up tables (LUT's). We consider a generalized structure for the basic FPGA logic element (cell); it includes devices such as LUT's, sequential elements (flip-flops), multiplexers and control circuitry. We use a hybrid fault model for these devices. The model is based on a physical as well as a behavioral characterization. This permits detection of all single faults (either stuck-at or functional) and some multiple faults using repeated FPGA reprogramming. We show that different arrangements of disjoint one-dimensional (1-D) cell arrays with cascaded horizontal connections and common vertical input lines provide a good logic testing regimen. The testing time is independent of the number of cells in the array (C-testability). We define new conditions for C-testability of programmable/reconfigurable arrays. These conditions do not suffer from limited I/O pins. Cell configuration affects the controllability/observability of the iterative array. We apply the approach to various Xilinx FPGA families and compare it to prior work.
Authors: Lovelly, Tyler M.; George, Alan D.
Univ Florida, Dept Elect & Comp Engn, NSF Ctr High Performance Reconfigurable Comp, 330 Benton Hall, Gainesville, FL 32611, USA
Univ Florida, Dept Elect & Comp Engn, NSF Ctr High Performance Reconfigurable Comp, 327 Larsen Hall, Gainesville, FL 32611, USA
Due to harsh and inaccessible operating environments, space computing presents many unique challenges and constraints that limit onboard computing performance. However, the increasing need for real-time sensor and autonomous processing, coupled with limited communication bandwidth with ground stations, is increasing onboard computing demands for next-generation space missions. Because currently available space-grade processors cannot satisfy this growing demand, research into various processors is conducted to ensure that potential new processors are based upon architectures that will best meet the computing needs of space missions. Device metrics are used to measure and compare the theoretical capabilities of processors based upon vendor-provided data and tools, enabling the study of large and diverse sets of architectures. Architectural tradeoffs are determined that can be considered when comparing or designing space-grade processors. Results demonstrate how onboard computing capabilities are increasing due to emerging architectures that support high levels of parallelism in terms of computational units, internal memories, and input/output resources, and that performance varies between applications, depending on the compute-intensive kernels used. Furthermore, the overheads incurred by radiation hardening are quantified and used to analyze low-power commercial-off-the-shelf processors for potential hardening and use in future space missions.
A real-time, VanderLugt-type optical correlator using a single SLM is developed. A field programmable gate array is used to capture and process images obtained from a CCD camera at a rate of 60 video fields/s. During both enrollment and verification, a finger slides over a glass prism and is input to the system via the frustration of the total internal reflection process. An autoenrollment procedure captures the optimal image during each slide. An optimal composite filter is implemented. The correlation detection process comprises real-time tracking of the correlation peak while the finger is sliding and a decision process based on projective decision boundaries. Real-life tests yielded a false rejection rate of 1% and a false acceptance rate of 0.2%. (C) 1999 Society of Photo-Optical Instrumentation Engineers. [S0091-3286(99)00901-0].
This paper presents a hardware realization of a genetic algorithm (GA) for the path planning problem of mobile robots on a field programmable gate array (FPGA). A customized GA intellectual property (IP) core was designed and implemented on an FPGA. A Xilinx XUPV5-LX110T FPGA device was used as the hardware platform. The proposed GA IP core was applied to a Pioneer 3-DX mobile robot to confirm its path planning performance. For localization tasks, a camera mounted on the ceiling of the laboratory was used to capture images and allow the robot to determine its own location and the obstacles in the environment. In this way, the path planning procedures were tested in a real laboratory environment. A substantial speedup was achieved compared with the software implementation. Experimental results illustrate the effectiveness of the GA IP core hardware.