Graph-based neural networks have seen tremendous adoption for complex predictive analytics on massive real-world graphs. Work on hardware acceleration has identified significant challenges in harnessing graph locality and handling workload imbalance, both of which stem from ultra-sparse, irregular matrix computations at a massively parallel scale. State-of-the-art hardware accelerators rely on massive multithreading and asynchronous execution in GPUs to achieve parallel performance, at the cost of high power consumption. This paper aims to bridge the power-performance gap using the energy-efficiency-centric RISC-V ecosystem. A 1000-core RISC-V processor is proposed to unlock massive parallelism in graph-based matrix operators and to provide a low-latency data access paradigm in hardware, yielding robust power-performance scaling. Each core implements a single-threaded pipeline with a novel graph-aware data prefetcher; at the 1000-core scale, the design delivers an average 20x performance-per-watt advantage over a state-of-the-art NVIDIA GPU.
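The irregularity and workload imbalance mentioned above can be illustrated with a small sparse matrix-vector product. The following is only a minimal sketch, not the paper's hardware or software design: it assumes the graph adjacency is stored in the common CSR layout, and the names `csr_spmv` and `row_work` are hypothetical helpers introduced for illustration.

```python
from typing import List


def csr_spmv(indptr: List[int], indices: List[int],
             data: List[float], x: List[float]) -> List[float]:
    """Multiply a CSR sparse matrix by a dense vector x, one row at a time."""
    y = []
    for row in range(len(indptr) - 1):
        acc = 0.0
        # The non-zeros of this row sit in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            # indices[k] is data-dependent, so reads of x jump around memory.
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y


def row_work(indptr: List[int]) -> List[int]:
    """Number of non-zeros per row, i.e. the work one core gets per row."""
    return [indptr[i + 1] - indptr[i] for i in range(len(indptr) - 1)]


# Illustrative star graph: node 0 is a hub linked to nodes 1, 2 and 3.
indptr = [0, 3, 4, 5, 6]
indices = [1, 2, 3, 0, 0, 0]
data = [1.0] * 6
x = [1.0, 2.0, 3.0, 4.0]

print(csr_spmv(indptr, indices, data, x))
print(row_work(indptr))
```

In this toy graph the hub row carries three times the work of every other row, so a one-row-per-core assignment leaves most cores idle while one is still busy. The indirect `x[indices[k]]` reads are the kind of access pattern a graph-aware prefetcher would presumably target.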
Modern dynamical systems are rapidly incorporating artificial intelligence to improve the efficiency and quality of complex predictive analytics. To operate efficiently on increasingly large datasets and intrinsically dynamic non-Euclidean data structures, the computing community has turned to Graph Neural Networks (GNNs). We make the key observation that existing GNN processing frameworks do not efficiently handle the intrinsic dynamics of modern GNNs. They process the complete graph at every time step as if it were static, which leads to redundant computation and severe under-utilization of system resources. We propose a novel dynamic graph neural network (DGNN) processing framework that captures the dynamically evolving dataflow of the GNN semantics, i.e., graph embeddings and sparse connections between graph nodes. The framework identifies intrinsic redundancies in node connections and extracts a representative, node-sparse view of the graph that the system can readily ingest for processing. Our evaluation on an NVIDIA GPU shows up to 3.5x speedup over a baseline that processes all nodes at each time step.
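The idea of skipping redundant work between time steps can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the authors' framework: it assumes a single mean-aggregation layer, undirected edges, and node features that stay fixed across time steps, so a node's output changes only when one of its incident edges changes. All function names are hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[int, int]


def changed_nodes(prev_edges: Set[Edge], curr_edges: Set[Edge]) -> Set[int]:
    """Nodes that are endpoints of an edge added or removed between snapshots."""
    delta = prev_edges ^ curr_edges  # symmetric difference of the edge sets
    return {node for edge in delta for node in edge}


def aggregate(node: int, edges: Set[Edge], feats: Dict[int, float]) -> float:
    """Mean of the node's own feature and its neighbours' features."""
    nbrs = [v for u, v in edges if u == node] + [u for u, v in edges if v == node]
    vals = [feats[node]] + [feats[n] for n in nbrs]
    return sum(vals) / len(vals)


def incremental_step(prev_edges: Set[Edge], curr_edges: Set[Edge],
                     feats: Dict[int, float],
                     prev_out: Dict[int, float]) -> Tuple[Dict[int, float], Set[int]]:
    """Reuse the previous outputs and recompute only nodes whose edges changed."""
    dirty = changed_nodes(prev_edges, curr_edges)
    out = dict(prev_out)
    for node in dirty:
        out[node] = aggregate(node, curr_edges, feats)
    return out, dirty


# Illustrative snapshots: one edge (2, 3) is added at the new time step.
feats = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
prev_edges = {(0, 1), (1, 2)}
curr_edges = {(0, 1), (1, 2), (2, 3)}

prev_out = {n: aggregate(n, prev_edges, feats) for n in feats}
new_out, dirty = incremental_step(prev_edges, curr_edges, feats, prev_out)
print(sorted(dirty), new_out)
```

Here only two of the four nodes are recomputed, yet the result matches a full recomputation over the new snapshot. With multiple layers or time-varying features the set of affected nodes grows (it spreads one hop per layer), so a real system would need a more careful dependency analysis than this sketch performs.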