Convolutional neural network (CNN) models equipped with depthwise separable convolution (DSC) promise a lower spatial complexity while retaining high model accuracy. However, little attention has been paid to their hardware architecture. Previous studies on DSC-based CNN accelerators typically apply a fixed computational model to all networks, leading to an imbalance between power, efficiency, and performance. To address this problem, this paper proposes a novel, real-time DSC-based CNN accelerator that can accommodate field-programmable gate arrays (FPGAs) of different capacities and CNNs of different sizes. A dynamically reconfigurable computing engine and a block-convolution-based adaptive dataflow scheduling mode strike a trade-off between hardware resources and processing speed in industrial processes. The proposed MobileNet accelerator was implemented and evaluated on the Xilinx XC7020 platform. The experimental results showed that, compared to previous FPGA-based accelerators, the approach can provide 10.86 GOPS of computational performance for full HD RGB images, meeting the needs of industrial real-time applications.
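To make the complexity claim in the abstract above concrete, the following is a minimal sketch of the standard textbook cost comparison between an ordinary convolution and a depthwise separable one. It is not taken from the paper: the function names and the example layer shape are assumptions chosen for illustration, and the count covers multiply-accumulate operations (MACs) only, with stride 1 and "same" padding assumed.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution with an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def dsc_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


def reduction_ratio(c_out, k):
    """Closed form of dsc_macs / standard_conv_macs: 1/c_out + 1/k^2."""
    return 1 / c_out + 1 / (k * k)


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    h, w, c_in, c_out, k = 112, 112, 32, 64, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    dsc = dsc_macs(h, w, c_in, c_out, k)
    print(f"standard: {std} MACs, DSC: {dsc} MACs, ratio: {dsc / std:.4f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is the usual argument for using DSC in resource-constrained accelerators.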
This investigation focuses on a simultaneous wireless information and power transfer (SWIPT) system, significantly enhanced by an active simultaneously transmitting and reflecting reconfigurable intelligent surface (a...
ISBN (print): 9781665437592
The increase in size, capabilities, and speed of FPGAs enables the shared usage of reconfigurable resources by multiple applications and even operating systems. While research on FPGA virtualization in HPC data centers and the cloud is already well advanced, it is a rather new concept for embedded systems. The necessity for FPGA virtualization of embedded systems results from the trend to integrate multiple environments into the same hardware platform. As multiple guest operating systems with different requirements, e.g., regarding real-time behavior, security, safety, or reliability, share the same resources, the focus of research lies on isolation under the constraint of having minimal impact on the overall system. Drivers for this development are, e.g., computation-intensive AI-based applications in the automotive or medical field, embedded 5G edge computing systems, or the consolidation of electronic control units (ECUs) on a centralized MPSoC with the goal of increasing reliability by reducing complexity. This survey outlines key concepts of hypervisor-based virtualization of embedded reconfigurable systems. Hypervisor approaches are compared and classified into FPGA-based hypervisors, MPSoC-based hypervisors, and hypervisors for distributed embedded reconfigurable systems. Strong points and limitations are pointed out, and future trends for virtualization of embedded reconfigurable systems are identified.
ISBN (digital): 9781665459853
ISBN (print): 9781665459853
Statistics show that in 2030 the number of connected IoT devices will reach 25.44 billion, which can lead to security breaches in the back end of high-performance computing clusters connected to the same network. Unfortunately, current security primitives are not suitable for implementation on the physically constrained devices designed for IoT. Thus, the National Institute of Standards and Technology (NIST) has announced a worldwide lightweight cryptography (LWC) competition for securing tiny devices. This paper introduces a flexible, reconfigurable, and energy-efficient crypto-processor running one of the LWC finalist candidates, ASCON. ASCON uses a sponge construction that requires fewer memory accesses, which leads to lower power consumption than other candidates. The proposed processor is reconfigurable in that both the authenticated cipher (encryption and decryption) and the hash functions of ASCON are implemented in a compact six-mode design, covering a diversity of applications in the IoT spectrum. The design is developed in Chisel and evaluated in 28/32nm technology with commercial EDA tools. Evaluation results show that the proposed processor achieves the highest throughput while consuming 29% less power, operating at over 667 MHz. The design has also been implemented in the Skywater 130nm technology node with the latest released OpenLane design flow to ensure an end-to-end open-source delivery of the IP.
With a pledge to improve the performance per watt, FPGAs have earned a place in the world’s leading center for high-performance computing, which has opened new avenues for research. In pursuit of an optimized topology in the context of reconfigurable computing, we analyzed six selected 2D and 3D NoC topologies. The results showed that the 3D Torus demonstrated significant throughput as network size, the number of nodes transferring messages, and message size varied. Based on performance analysis and anticipated acceptable resource utilization, we singled out the 3D Torus for further investigation and optimization.
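As background for the 3D Torus topology named in the abstract above, the following minimal sketch shows how wrap-around links define a node's neighbors and shorten hop counts compared with a plain mesh. It illustrates the general topology only, not the authors' NoC design; the function names and the 4 x 4 x 4 network size are assumptions.

```python
def torus_neighbors(node, dims):
    """Return the directly connected nodes of `node` in a 3D torus of size `dims`."""
    x, y, z = node
    nx, ny, nz = dims
    candidates = {
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),
    }
    candidates.discard(node)  # a dimension of size 1 would wrap onto the node itself
    return sorted(candidates)


def torus_distance(a, b, dims):
    """Minimal hop count between a and b, taking the shorter way around each ring."""
    return sum(min((ai - bi) % n, (bi - ai) % n) for ai, bi, n in zip(a, b, dims))


def mesh_distance(a, b):
    """Hop count in a mesh without wrap-around links (Manhattan distance)."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical network size
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_distance((0, 0, 0), (3, 3, 3), dims), mesh_distance((0, 0, 0), (3, 3, 3)))
```

In this example the corner-to-corner route takes 3 hops in the torus versus 9 in the mesh, which is one reason torus topologies tend to sustain throughput as the network grows.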
This paper introduces a novel FPGA-based hardware implementation of a pseudorandom word (PSW) generator. Using the FPGA reconfiguration property, the proposed approach allows you to change the algorithms and replace t...
ISBN (print): 9781728196664
Field Programmable Gate Arrays (FPGAs) have been targeted as a new accelerator for the HPC field. This is because the barrier to using FPGAs has been gradually lowered by the widespread use of high-level synthesis (HLS) technology. The external memory bandwidth of FPGAs, however, has been much lower than that of other accelerators widely used in HPC, such as NVIDIA V100 GPUs. The latest FPGAs can use High Bandwidth Memory 2 (HBM2), which has a memory bandwidth of up to 512GB/s. Therefore, we believe FPGAs will be a viable option for speeding up applications. Unlike CPUs and GPUs, however, FPGAs do not have caches and memory networks to exploit the full potential of HBM2, which may limit the efficiency of the application. In this paper, we propose a memory system for HBM2 and HPC applications. We show the prototype implementation of the system and evaluate its performance. We also demonstrate the use of the proposed system from an application developed with HLS and written in C++.
The paper presents a frequency reconfigurable bowtie antenna designed on silicon carbide (SiC) substrate. A monolithic active antenna is achieved thanks to the co-design method between the active integrated junctions ...
ISBN (digital): 9798350387414
ISBN (print): 9798350387421
This paper examines the critical function of Field-Programmable Gate Arrays (FPGAs) in accelerating Spiking Neural Networks (SNNs) for real-time edge neuromorphic computing. Our work systematically evaluates the integration of FPGA technology for the optimization and acceleration of SNN models. The analysis covers the power efficiency, low-latency processing, and parallelism that are intrinsic benefits of FPGAs, emphasizing their relevance for edge computing applications. We discuss the smooth transfer of trained SNN models to FPGA platforms. Using an extensive analysis of state-of-the-art architectures, we demonstrate the efficiency benefits of using FPGAs to accelerate SNNs. We derive more insights into the real-world applications of this FPGA-SNN integration in various fields. The analysis supports advances in edge computing and neuromorphic processing paradigms by adding to the collective knowledge of how FPGAs enhance the real-time processing capabilities of Spiking Neural Networks.
ISBN (print): 9798350323481
Heterogeneity and reconfigurability have both been adopted by accelerators to improve their flexibility and efficiency for a wide variety of applications, from cloud computing to embedded systems. This paper provides an overview of the trends of heterogeneous reconfigurable accelerators including field-programmable gate arrays and coarse-grained reconfigurable arrays, and the related design automation approaches for enhancing design quality and designer productivity of these accelerators. We shall also discuss how recent advances in technology, such as multi-level co-design, heterogeneous Function-as-a-Service and meta-programming, would help address the challenges in engineering next-generation heterogeneous reconfigurable accelerators and beyond.