Real-time calibration for a beamforming receiver is a popular topic. We designed and implemented a 4-antenna Rx beamforming system with real-time calibration. This system sends an out-of-band calibration signal from t...
ISBN (digital): 9781665485241
ISBN (print): 9781665485241
Hardware-aware Neural Architecture Search (NAS) and mapping and scheduling optimization methods are being used to find efficient implementations of computationally intensive language models such as BERT. This requires measuring real hardware inference latency: good design decisions simply cannot be made with proxy metrics such as FLOPs or the number of parameters. However, the time required to perform on-device latency measurements is prohibitive (e.g., a few days to a few weeks over the course of an optimization run). To address this, we present BERTPerf, a low-cost, highly accurate method to predict the inference time of BERT on ARM *** multi-core processors. BERTPerf exploits latency patterns at the layer level to reduce on-device latency measurements, and captures the effect of caching and intermediate tensor allocations to reduce latency prediction error. BERTPerf reduces the maximum prediction error by 7-11% compared to the state of the art, and requires 75% fewer on-device measurements than existing work at the same prediction error.
This work presents the design and implementation of a fully integrated ISO 11784/5 RFID reader front-end ASIC for both HDX and FDX transponder types. The ASIC was designed in a 0.18 μm CMOS-HV technology, an...
This paper introduces a synthesizable μ-architectural design method to boost the performance of a given RISC-V processor architecture by utilizing Canonical Signed Digit (CSD) representation during the execution stag...
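For readers unfamiliar with Canonical Signed Digit (CSD) representation, the following is a minimal Python sketch of the number format only. It is not the paper's micro-architecture; the function names `to_csd` and `from_csd` are hypothetical and chosen for illustration. CSD writes an integer with digits from {-1, 0, 1} such that no two adjacent digits are non-zero, which tends to reduce the number of non-zero digits that arithmetic hardware has to process.

```python
def to_csd(n: int) -> list[int]:
    """Return the CSD digits of a non-negative integer, least significant first.

    Each digit is -1, 0, or 1, and no two adjacent digits are both non-zero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> digit +1; n mod 4 == 3 -> digit -1
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def from_csd(digits: list[int]) -> int:
    """Rebuild the integer from CSD digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


# Illustrative value: 7 = 8 - 1, so only two non-zero digits are needed
# instead of the three ones in binary 111.
csd_of_7 = to_csd(7)
```

In this sketch, `to_csd(7)` yields the digits of 8 - 1, least significant first.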
Division is one of the most commonly sought-after algorithms for image processing operations such as normalization, filtering, enhancement, and deconvolution. Hence, the design of efficient division algorithm ...
The following document presents an exploration of the design and implementation of floating-point Finite Impulse Response (FIR) filters. The design process involves the selection of appropriate filter specifications, ...
Higher spectral and energy efficiencies are the envisioned defining characteristics of high data-rate sixth-generation (6G) wireless networks. One of the enabling technologies to meet these requirements is index modulation (IM), which transmits information through permutations of indices of spatial, frequency, or temporal media. In this paper, we propose novel electromagnetics-compliant designs of reconfigurable intelligent surface (RIS) apertures for realizing IM in 6G transceivers. We consider RIS modeling and implementation of spatial and subcarrier IMs, including beam steering, spatial multiplexing, and phase modulation capabilities. Numerical experiments for our proposed implementations show that the bit error rates obtained via RIS-aided IM outperform traditional implementations. We further establish the programmability of these transceivers to vary the reflection phase and generate frequency harmonics for IM through full-wave electromagnetic analyses of a specific reflect-array metasurface implementation.
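The index-modulation idea mentioned above can be illustrated with a minimal Python sketch of conventional spatial index modulation, in which some bits select the active transmit element and the remaining bits select the symbol it sends. This is a textbook-style illustration, not the RIS-based design of the paper; the QPSK constellation, the 4x4 random channel, and the names `sm_modulate` and `sm_demodulate` are all assumptions made for the example.

```python
import numpy as np

# Assumed constellation: unit-energy QPSK (2 bits per symbol).
QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)


def sm_modulate(bits, n_tx=4):
    """Map bits to a transmit vector with exactly one active element."""
    n_idx = int(np.log2(n_tx))
    idx = int("".join(str(b) for b in bits[:n_idx]), 2)           # index bits
    sym = int("".join(str(b) for b in bits[n_idx:n_idx + 2]), 2)  # symbol bits
    x = np.zeros(n_tx, dtype=complex)
    x[idx] = QPSK[sym]
    return x


def sm_demodulate(y, H):
    """Exhaustive maximum-likelihood search over (index, symbol) pairs."""
    n_tx = H.shape[1]
    n_idx = int(np.log2(n_tx))
    candidates = ((i, s) for i in range(n_tx) for s in range(len(QPSK)))
    i, s = min(candidates, key=lambda p: np.linalg.norm(y - H[:, p[0]] * QPSK[p[1]]))
    return [int(b) for b in format(i, f"0{n_idx}b") + format(s, "02b")]


# Noiseless round trip over a hypothetical random channel.
rng = np.random.default_rng(1)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
tx_bits = [1, 0, 1, 1]
rx_bits = sm_demodulate(H @ sm_modulate(tx_bits), H)
```

With four transmit elements and QPSK, each channel use carries four bits: two through the choice of index and two through the symbol.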
ISBN (print): 9798350318340
A brain-computer interface (BCI) is a system that may benefit people with severe motor disabilities by allowing them to communicate using their brain signals. However, current BCI implementations tend to use large and heavy platforms, such as personal computers (PCs), which limit full integration with portable devices. Due to its parallelism, reconfigurable features, and capability to perform multi-channel processing, the Field-Programmable Gate Array (FPGA) platform is suitable for electroencephalography (EEG) signal processing. This paper presents the design and implementation of an FPGA-based BCI embedded system for real-time eye state classification. The system was implemented using a Xilinx Artix-7 family FPGA. The designed system filtered EEG signals using FIR filters, and the pattern features were calculated using Power Spectral Density (PSD). Furthermore, Linear Discriminant Analysis (LDA) was used to classify EEG data related to the eye state. The proposed system was tested using recorded data from a subject acquired by the open-source biosensing board Cyton for offline and online evaluation. The system achieved an accuracy of 81.1% during real-time sessions. Finally, the results show the execution time, resources, and power consumption of the designed system.
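The processing chain described above (FIR filtering, PSD features, LDA classification) can be sketched in software as follows. This is a floating-point NumPy illustration on synthetic data, not the authors' FPGA design: the 250 Hz sampling rate, the 5-30 Hz pass band, the two feature bands, the 2-second window, and the synthetic signal model (a 10 Hz component added for one class) are all assumptions made for the example.

```python
import numpy as np

FS = 250       # Hz; assumed sampling rate
WIN = 2 * FS   # assumed 2-second analysis window


def fir_bandpass(lo, hi, fs=FS, taps=101):
    """Windowed-sinc FIR band-pass coefficients (Hamming window)."""
    n = np.arange(taps) - (taps - 1) / 2
    h = 2 * hi / fs * np.sinc(2 * hi * n / fs) - 2 * lo / fs * np.sinc(2 * lo * n / fs)
    return h * np.hamming(taps)


def psd_features(x, fs=FS):
    """Log band power in two illustrative bands from a periodogram PSD."""
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * len(x))
    f = np.fft.rfftfreq(len(x), 1 / fs)
    bands = [(8, 13), (13, 30)]  # assumed feature bands
    return np.array([np.log(psd[(f >= lo) & (f < hi)].sum()) for lo, hi in bands])


def fit_lda(X, y):
    """Two-class LDA with equal priors: returns weight vector and bias."""
    mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(Sw, mu1 - mu0)
    return w, -w @ (mu0 + mu1) / 2


def synth_epoch(rng, label):
    """Synthetic stand-in for one EEG epoch; class 1 gets a 10 Hz component."""
    t = np.arange(WIN) / FS
    x = rng.standard_normal(WIN)
    if label:
        x += 2.0 * np.sin(2 * np.pi * 10 * t + rng.uniform(0, 2 * np.pi))
    return x


rng = np.random.default_rng(0)
h = fir_bandpass(5, 30)
labels = np.array([i % 2 for i in range(200)])
X = np.array([psd_features(np.convolve(synth_epoch(rng, c), h, mode="same"))
              for c in labels])
w, b = fit_lda(X[:100], labels[:100])          # train on the first half
pred = (X[100:] @ w + b > 0).astype(int)       # classify the second half
accuracy = float((pred == labels[100:]).mean())
```

The accuracy obtained on this synthetic data says nothing about real EEG; it only shows that the three stages fit together as described.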
Digital channelizers (DCs) based on the Discrete Fourier Transform (DFT) and polyphase filter banks are widely used in on-board processing (OBP) platforms to extract narrowband sub-channels from a wideband signal efficiently. In high-capacity communication satellite platforms, multiple DCs always extract narrowband signals from multiple wideband signals in parallel. Field-programmable gate arrays (FPGAs) are a popular option for implementing DCs due to their parallel computing capabilities and good reconfigurability, but FPGAs suffer single-event upsets (SEUs) in the space environment. This paper focuses on the efficient protection of parallel DCs with enhanced coding techniques. We first prove that a linear relationship between parallel DCs can be introduced and maintained among the multiple outputs. However, traditional coding schemes cannot be directly applied to detect faulty DCs because of the quantization noise introduced by fixed-point implementations. To address this issue, we propose an enhanced coding scheme that averages in the space and time domains to minimize the effect of quantization noise, and introduces thresholds and a majority voter to further improve the detection probability. Both theoretical analysis and fault injection experiments confirm the effectiveness of the proposed protection scheme. Experimental results show that all SEUs that cause an SNR lower than 20 dB can be detected and recovered, and that the resource overheads are about 1.6 times and 1.3 times those of the unprotected DCs for systems with 8 DCs and 16 DCs, respectively.
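The linearity argument above can be illustrated with a small Python sketch: because a channelizer is a linear operation, a redundant channelizer fed with the sum of the inputs should reproduce the sum of the outputs, up to quantization noise. The sketch below uses a quantized block DFT as a stand-in for a real polyphase DC, averages the mismatch over time, and compares it with a threshold. It covers detection only and omits the paper's spatial averaging and majority voter; the quantization step, the threshold value, and the injected fault are hypothetical.

```python
import numpy as np


def quantize(x, step=2 ** -10):
    """Round real and imaginary parts to a fixed-point grid (assumed step)."""
    return np.round(x.real / step) * step + 1j * np.round(x.imag / step) * step


def channelize(x, n_ch=8):
    """Stand-in linear channelizer: block DFT with quantized output."""
    blocks = x.reshape(-1, n_ch)
    return quantize(np.fft.fft(blocks, axis=1) / n_ch)


def checksum_mismatch(outputs, check_output):
    """Time-averaged power of (sum of DC outputs - redundant DC output)."""
    err = outputs.sum(axis=0) - check_output
    return float(np.mean(np.abs(err) ** 2))


rng = np.random.default_rng(0)
inputs = rng.standard_normal((4, 512)) + 1j * rng.standard_normal((4, 512))

outputs = np.stack([channelize(x) for x in inputs])  # four parallel DCs
check = channelize(inputs.sum(axis=0))               # redundant DC on summed input

threshold = 1e-4  # hypothetical; must sit above the quantization-noise floor
clean = checksum_mismatch(outputs, check)

faulty = outputs.copy()
faulty[2] *= 0.5  # hypothetical fault: DC 2 output corrupted
corrupted = checksum_mismatch(faulty, check)

fault_detected = corrupted > threshold
```

The fault-free mismatch is non-zero because each channelizer quantizes independently, which is why a plain equality check would not work and a threshold is needed.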
Data processing processors use the Tree adder as a basic building block for fast arithmetic operations. As the scale of integration develops, more and more signal processing systems are being built on VLSI chips, whic...