Search Results - Inner Mongolia University Library

IEEE COMPUTER ARCHITECTURE LETTERS, 2025, Vol. 24, No. 1, pp. 73-76

Author: Khan, Omer (Univ Connecticut, Elect & Comp Engn, Storrs, CT 06269, USA)

Graphs-based neural networks have seen tremendous adoption to perform complex predictive analytics on massive real-world graphs. The trend in hardware acceleration has identified significant challenges with harnessing graph locality and workload imbalance due to ultra-sparse and irregular matrix computations at a massively parallel scale. State-of-the-art hardware accelerators utilize massive multithreading and asynchronous execution in GPUs to achieve parallel performance at high power consumption. This paper aims to bridge the power-performance gap using the energy efficiency-centric RISC-V ecosystem. A 1000-core RISC-V processor is proposed to unlock massive parallelism in the graphs-based matrix operators to achieve a low-latency data access paradigm in hardware to achieve robust power-performance scaling. Each core implements a single-threaded pipeline with a novel graph-aware data prefetcher at the 1000 cores scale to deliver an average 20x performance per watt advantage over state-of-the-art NVIDIA GPU.

Keywords: Prefetching; Graphics processing units; Sparse matrices; Hardware acceleration; Vectors; Single instruction multiple data; Multithreading; Load modeling; Training; Social networking (online); distributed and scalable parallelism; graph neural networks; graphs; sparse and unstructured matrix operators; massively parallel hardware
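The abstract above attributes the difficulty to sparse, irregular matrix computations and workload imbalance. A minimal Python sketch of neighbour aggregation over a compressed sparse row (CSR) adjacency may help make that concrete. It is an illustration only, not the paper's design: the function names `csr_aggregate` and `row_work` and the toy star graph are assumptions introduced here.

```python
def csr_aggregate(row_ptr, col_idx, features):
    """Sum each node's neighbour features using a CSR adjacency.

    row_ptr[i]..row_ptr[i+1] delimits node i's neighbours in col_idx.
    The reads of features[col_idx[k]] are data-dependent, which is the
    irregular access pattern that is hard to cache or prefetch.
    """
    out = []
    for row in range(len(row_ptr) - 1):
        start, end = row_ptr[row], row_ptr[row + 1]
        out.append(sum(features[col_idx[k]] for k in range(start, end)))
    return out


def row_work(row_ptr):
    """Number of neighbours per node, a proxy for per-row work."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Toy star graph (hypothetical data): node 0 is linked to nodes 1, 2 and 3.
row_ptr = [0, 3, 4, 5, 6]
col_idx = [1, 2, 3, 0, 0, 0]
features = [1.0, 2.0, 3.0, 4.0]

aggregated = csr_aggregate(row_ptr, col_idx, features)
work = row_work(row_ptr)
```

In this toy graph the hub node does three times the work of every other node, so assigning one row per core would leave most cores idle while the hub finishes. Real-world graphs show the same skew at far larger scale.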

Exploiting Intrinsic Redundancies in Dynamic Graph Neural Networks for Processing Efficiency

IEEE COMPUTER ARCHITECTURE LETTERS, 2024, Vol. 23, No. 2, pp. 170-174

Authors: Gurevin, Deniz; Ding, Caiwen; Khan, Omer (Univ Connecticut, Dept Elect & Comp Engn, Storrs, CT 06269, USA; Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269, USA)

Modern dynamical systems are rapidly incorporating artificial intelligence to improve the efficiency and quality of complex predictive analytics. To efficiently operate on increasingly large datasets and intrinsically dynamic non-euclidean data structures, the computing community has turned to Graph Neural Networks (GNNs). We make a key observation that existing GNN processing frameworks do not efficiently handle the intrinsic dynamics in modern GNNs. The dynamic processing of GNN operates on the complete static graph at each time step, leading to repetitive redundant computations that introduce tremendous under-utilization of system resources. We propose a novel dynamic graph neural network (DGNN) processing framework that captures the dynamically evolving dataflow of the GNN semantics, i.e., graph embeddings and sparse connections between graph nodes. The framework identifies intrinsic redundancies in node-connections and captures representative node-sparse graph information that is readily ingested for processing by the system. Our evaluation on an NVIDIA GPU shows up to 3.5x speedup over the baseline setup that processes all nodes at each time step.

Keywords: distributed and scalable parallelism; dynamic graphs; graph neural networks
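The abstract above contrasts a baseline that processes every node at each time step with a framework that avoids redundant work. The sketch below shows the general idea of skipping unchanged nodes between two graph snapshots. It is a generic incremental-recompute illustration under stated assumptions, not the authors' framework: the names `aggregate`, `full_step` and `incremental_step`, the one-layer mean aggregation, and the toy path graph are all hypothetical.

```python
def aggregate(adj, feat, v):
    """Toy one-layer GNN update: mean of a node's own and neighbours' features."""
    group = [v] + sorted(adj[v])
    return sum(feat[u] for u in group) / len(group)


def full_step(adj, feat):
    """Baseline: recompute every node at this time step."""
    return {v: aggregate(adj, feat, v) for v in adj}


def incremental_step(prev_adj, prev_feat, prev_out, adj, feat):
    """Recompute only nodes whose inputs changed since the previous snapshot."""
    # A node is dirty if its neighbour set or its own feature changed.
    dirty = {
        v for v in set(prev_adj) | set(adj)
        if prev_adj.get(v) != adj.get(v) or prev_feat.get(v) != feat.get(v)
    }
    # A node's output depends on itself and its neighbours.
    affected = {v for v in adj if v in dirty or any(u in dirty for u in adj[v])}
    out = {
        v: aggregate(adj, feat, v) if v in affected else prev_out[v]
        for v in adj
    }
    return out, affected


# Toy path graph 0-1-2-3 (hypothetical data); only node 3's feature changes.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
feat_t0 = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
feat_t1 = {**feat_t0, 3: 8.0}

out_t0 = full_step(adj, feat_t0)
out_t1, recomputed = incremental_step(adj, feat_t0, out_t0, adj, feat_t1)
```

Here only nodes 2 and 3 are recomputed at the second step, and the result matches a full recomputation. With more layers the affected set grows by one hop per layer, so the savings depend on how localized the changes are.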
