ISBN: 9781450318679 (print)
The challenge of building neuromorphic circuits in mature CMOS technology to match brain-like architectures is two-fold: scalability and connectivity. Scalability means that the circuits have to be expandable to match biological brains in terms of synaptic and neuronal densities. The challenge here is to implement 10^6 neurons and 10^10 synapses, with an average fanout of 10^4, in a square cm of CMOS. Connectivity means that the circuit has to support both short- and long-range (by physical distance) connections between neurons. A large part of this challenge is how to implement a connectivity of 10^4 synapses per neuron. Unfortunately, even the exponential growth in transistor density being experienced today is not sufficient to realize such massive connectivity and synaptic densities in a traditional CMOS process. Recent approaches to these challenges integrate CMOS with nanotechnology in order to achieve the required synaptic densities. These solutions predominantly use crossbar architectures, but connectivity remains a daunting task for them. To meet these challenges, a novel synaptic time-multiplexing (STM) concept was developed along with a neural fabric design. This combination offers greater flexibility and long-range connectivity. It also provides a method to overcome the limitations of conventional CMOS technology and match the synaptic density and connectivity requirements found in mammalian brains while maintaining nonlinear synapses and learning. In order to program neuromorphic hardware for any desired brain architecture, the topology first has to be converted into a connectivity matrix or a graph representation. This matrix, along with statistics on the number of neurons and synapses, is provided as input to a neuromorphic compiler. The neuromorphic compiler compiles the neural network structure description into: 1) an assignment of the network'
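The compiler input described in this abstract (a connectivity graph plus neuron and synapse statistics) can be illustrated with a minimal Python sketch. The function name `connectivity_stats`, the `(pre, post)` edge-list format, and the returned dictionary keys are assumptions made for illustration; the abstract does not specify the compiler's actual input format.

```python
from collections import defaultdict


def connectivity_stats(num_neurons, edges):
    """Summarize a network given as (pre, post) neuron index pairs.

    Hypothetical helper: the edge-list format and the returned keys are
    illustrative assumptions, not the paper's compiler interface.
    """
    fanout = defaultdict(set)  # pre-synaptic neuron -> set of targets
    for pre, post in edges:
        if not (0 <= pre < num_neurons and 0 <= post < num_neurons):
            raise ValueError(f"edge ({pre}, {post}) is out of range")
        fanout[pre].add(post)  # a set drops duplicate connections

    num_synapses = sum(len(targets) for targets in fanout.values())
    return {
        "neurons": num_neurons,
        "synapses": num_synapses,
        "mean_fanout": num_synapses / num_neurons,
    }


# Scale targets quoted in the abstract: neurons x average fanout = synapses.
TARGET_NEURONS = 10**6
TARGET_FANOUT = 10**4
TARGET_SYNAPSES = TARGET_NEURONS * TARGET_FANOUT  # equals 10**10

# Tiny example network: 4 neurons, 5 distinct synapses.
stats = connectivity_stats(4, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)])
```

The scale constants show that the three figures in the abstract are mutually consistent: 10^6 neurons with an average fanout of 10^4 give 10^10 synapses.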
ISBN: 9781728173832 (digital)
ISBN: 9781728173849 (print)
Deep learning systems have been successfully applied to Euclidean data such as images, video, and audio. In many applications, however, information and their relationships are better expressed with graphs. Graph Convolutional Networks (GCNs) appear to be a promising approach to efficiently learn from graph data structures, having shown advantages in many critical applications. As with other deep learning modalities, hardware acceleration is critical. The challenge is that real-world graphs are often extremely large and unbalanced; this poses significant performance demands and design challenges. In this paper, we propose Autotuning-Workload-Balancing GCN (AWB-GCN) to accelerate GCN inference. To address the issue of workload imbalance in processing real-world graphs, three hardware-based autotuning techniques are proposed: dynamic distribution smoothing, remote switching, and row remapping. In particular, AWB-GCN continuously monitors the sparse graph pattern, dynamically adjusts the workload distribution among a large number of processing elements (up to 4K PEs), and, after converging, reuses the ideal configuration. Evaluation is performed using an Intel D5005 FPGA with five commonly-used datasets. Results show that 4K-PE AWB-GCN can significantly elevate PE utilization by 7.7× on average and demonstrate considerable performance speedups over CPUs (3255×), GPUs (80.3×), and a prior GCN accelerator (5.1×).
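The workload-imbalance problem this abstract refers to can be illustrated with a small Python sketch. It compares a naive contiguous row-to-PE assignment with a greedy longest-processing-time assignment on a skewed set of per-row nonzero counts. The greedy scheme is a standard software stand-in chosen for illustration, not AWB-GCN's hardware autotuning; the function names and the sample row counts are assumptions.

```python
import heapq


def pe_loads_contiguous(row_nnz, num_pes):
    """Give each PE an equal-sized contiguous block of rows."""
    loads = [0] * num_pes
    block = -(-len(row_nnz) // num_pes)  # ceiling division
    for i, nnz in enumerate(row_nnz):
        loads[i // block] += nnz
    return loads


def pe_loads_greedy(row_nnz, num_pes):
    """Assign heaviest rows first, each to the least-loaded PE so far."""
    heap = [(0, pe) for pe in range(num_pes)]  # (current load, PE index)
    heapq.heapify(heap)
    loads = [0] * num_pes
    for nnz in sorted(row_nnz, reverse=True):
        load, pe = heapq.heappop(heap)
        loads[pe] = load + nnz
        heapq.heappush(heap, (loads[pe], pe))
    return loads


def utilization(loads):
    """Mean PE load divided by the maximum PE load (1.0 is perfect)."""
    peak = max(loads)
    return sum(loads) / (len(loads) * peak) if peak else 1.0


# Illustrative, made-up nonzero counts for 8 rows of a skewed sparse matrix.
row_nnz = [64, 32, 16, 8, 4, 2, 1, 1]
naive = pe_loads_contiguous(row_nnz, num_pes=4)
balanced = pe_loads_greedy(row_nnz, num_pes=4)
```

Comparing `utilization(naive)` with `utilization(balanced)` shows how much idle time a skewed distribution leaves on most PEs. A single very heavy row still bounds utilization under any row-granular assignment.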
True random number generators (TRNGs) provide unpredictable data for information-security operations such as encryption. In this paper, a novel TRNG structure is proposed based on the random jitter of a four...
As cloud services expand and privacy protection becomes more crucial, fully homomorphic encryption (FHE) schemes, which operate on data in the ciphertext domain, have attracted wide attention. Lattice-based cryptography ...