The paper presents a digital gate, programmable by a 1-bit signal, that can operate in one of two modes: OR or AND. A circuit of this type can also be implemented using conventional logic gates. Compared to such conventional solutions, however, the proposed circuit requires far fewer transistors and is also much faster. The logic gate was implemented in 180 nm CMOS technology and verified using HSPICE simulations. The gate has many possible applications, mostly in artificial intelligence and pattern recognition algorithms implemented at the transistor level.
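The behaviour of such a gate can be sketched at the logic level. The snippet below is only an illustration of the conventional-gate version mentioned in the abstract, not the proposed transistor-level circuit; the function name `programmable_gate` and the encoding (mode 0 selects AND, mode 1 selects OR) are assumptions, since the abstract does not state them.

```python
from itertools import product


def programmable_gate(a: int, b: int, mode: int) -> int:
    """Return a AND b when mode == 0 and a OR b when mode == 1.

    The mode encoding is an assumption made for this sketch.
    The expression is the 3-input majority function, which reduces
    to AND when the third input is 0 and to OR when it is 1.
    """
    return (a & b) | (mode & (a | b))


# Print the full truth table for both modes.
for mode, a, b in product((0, 1), repeat=3):
    print(f"mode={mode} a={a} b={b} -> {programmable_gate(a, b, mode)}")
```

Built from conventional two-input gates, this expression needs two AND gates and two OR gates, which is the kind of overhead the abstract says the proposed circuit avoids.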
The development of machine learning and IoT technology requires fast processing. RISC-V is an open-source instruction set architecture based on reduced instruction set principles, and a processor based on this architecture can be modified accordingly. The base integer instruction set supports operating system environments and is also suitable for embedded systems. Its 32-bit variant is defined as RV32I. In this paper, we propose a RISC-V processor core based on the 32-bit integer instruction set. The proposed core has a five-stage pipeline, including an optimized arithmetic and logic unit. The instruction fetch stage is merged with the pre-fetch stage dynamic branch prediction into a two-stage pipeline. The processor is implemented in Verilog HDL, and the resource utilization is verified for FPGA. The results show that the proposed module performs 30% better than the best-performing processor (considering operating frequency) and shows a 17.6% improvement in the proposed core.
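As a small illustration of what a decode stage for RV32I has to do, the sketch below splits a 32-bit R-type instruction word into its fields, following the field positions in the RISC-V base ISA. It is a software model for illustration only and is not taken from the paper's Verilog implementation; the function name `decode_r_type` is hypothetical.

```python
def decode_r_type(instr: int) -> dict:
    """Split a 32-bit RV32I R-type instruction word into its fields."""
    return {
        "opcode": instr & 0x7F,          # bits 6..0
        "rd": (instr >> 7) & 0x1F,       # bits 11..7
        "funct3": (instr >> 12) & 0x7,   # bits 14..12
        "rs1": (instr >> 15) & 0x1F,     # bits 19..15
        "rs2": (instr >> 20) & 0x1F,     # bits 24..20
        "funct7": (instr >> 25) & 0x7F,  # bits 31..25
    }


# 0x002081B3 is the encoding of `add x3, x1, x2`.
fields = decode_r_type(0x002081B3)
print(fields)
```

For this word the decoder reports opcode 0x33 (register-register arithmetic), destination register 3, and source registers 1 and 2.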
Convolutional neural networks (CNNs) have achieved significant accuracy improvements in many intelligent applications at the cost of intensive convolution operations and massive data movement. To deploy CNNs efficiently on low-power embedded platforms in real time, the depthwise separable convolution has been proposed to replace the standard convolution, especially in lightweight CNNs, which remarkably reduces the computational complexity and model size. However, it is difficult for a general convolution engine to obtain the theoretical performance improvement, as the decreased data dependency of depthwise convolution significantly reduces the opportunity for data reuse. To address this issue, a flexible and high-performance FPGA-based accelerator is proposed to efficiently process the inference of both large-scale and lightweight CNNs. First, by sharing the activation dataflow between the depthwise convolution and pooling layers, the control logic and data bus of the two layers are reused to maximize data utilization and minimize logic overhead. Second, these two layers can be processed either directly after standard convolutions, to eliminate external memory accesses, or independently, to gain better flexibility. Third, a performance model is proposed to automatically explore the optimal design options of the accelerator. The proposed hardware accelerator is evaluated on an Intel Arria 10 SoC FPGA and demonstrates state-of-the-art performance on both large-scale CNNs, e.g., VGG, and lightweight ones, e.g., MobileNet.
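The reduction in computation that the abstract attributes to depthwise separable convolution can be made concrete by counting multiply-accumulate operations (MACs). The sketch below is a back-of-the-envelope count and not the paper's performance model; it assumes stride 1, that `h` and `w` are the output feature-map dimensions, and the layer shape is a hypothetical example.

```python
def standard_conv_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs of a standard k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k  # one k x k filter per input channel
    pointwise = h * w * c_in * c_out  # 1 x 1 convolution mixing the channels
    return depthwise + pointwise


# Hypothetical layer: 56 x 56 output, 128 input and 128 output channels, 3 x 3 kernel.
std = standard_conv_macs(56, 56, 128, 128, 3)
sep = depthwise_separable_macs(56, 56, 128, 128, 3)
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")
```

The ratio of the two counts simplifies to 1/c_out + 1/k², so for 3 x 3 kernels and many output channels the separable form needs roughly one ninth of the MACs. The depthwise part applies each filter to a single channel, which is why it offers less data reuse than a standard convolution.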
ISBN: 9781424441228 (print)
The emerging neural-silicon interface devices bridge nervous systems with artificial systems and play a key role in neuro-prostheses and neuro-rehabilitation applications. Integrating neural signal collection, processing and transmission on a single device will make clinical applications more practical and feasible. This paper focuses on the wireless antenna part and the real-time neural signal analysis part of implantable brain-machine interface (BMI) devices. We propose to use millimeter waves for wireless connections between different areas of the brain. Various antennas, including microstrip patch, monopole and substrate integrated waveguide antennas, are considered for intra-cortical proximity communication. A Hebbian eigenfilter based method is proposed for multi-channel neuronal spike sorting. Folding and parallel design techniques are employed to explore various structures and make a trade-off between area and power consumption. Field-programmable gate arrays (FPGAs) are used to evaluate the various structures.
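The abstract does not spell out the Hebbian eigenfilter itself. As a rough illustration of the general idea, the sketch below uses Oja's rule, a standard Hebbian learning rule that converges to the principal eigenvector of the input covariance. This is a swapped-in textbook technique, not the paper's method; the synthetic 2-D data, the learning rate and the function name `oja_principal_component` are all assumptions.

```python
import math
import random


def oja_principal_component(samples, eta=0.01, w0=(1.0, 0.0)):
    """Estimate the principal direction of `samples` with Oja's rule."""
    w = list(w0)
    for x in samples:
        # Neuron output: projection of the sample onto the current weights.
        y = sum(wi * xi for wi, xi in zip(w, x))
        # Hebbian term y * x plus the decay term -y^2 * w that keeps |w| near 1.
        w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


# Synthetic data (an assumption): large variance along (1, 1), small along (1, -1).
rng = random.Random(0)
r = 1 / math.sqrt(2)
samples = []
for _ in range(5000):
    s = rng.gauss(0.0, 1.0)  # dominant component
    n = rng.gauss(0.0, 0.1)  # weak orthogonal component
    samples.append((r * (s + n), r * (s - n)))

w = oja_principal_component(samples)
norm = math.hypot(*w)
# Absolute cosine between the learned weights and the true dominant direction.
alignment = abs(w[0] * r + w[1] * r) / norm
print(f"w = {w}, |w| = {norm:.3f}, alignment = {alignment:.4f}")
```

In a spike-sorting setting, the samples would be spike waveforms rather than 2-D points, and projections onto the learned directions would serve as features for clustering. The update uses only multiply-accumulate operations, the kind of regular arithmetic that folding and parallel hardware structures can trade off between area and power.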