Hypergraph Neural Networks (HGNNs) are increasingly utilized to analyze complex inter-entity relationships. Traditional HGNN systems, based on a hyperedge-centric dataflow model, independently process aggregation task...
Graph neural networks (GNNs) have seen widespread usage across multiple real-world applications, yet in transductive learning, they still face challenges in accuracy, efficiency, and scalability, due to the extensive ...
Powered by the massive data generated by the proliferation of mobile and Web-of-Things (WoT) devices, Deep Neural Networks (DNNs) have grown in both accuracy and size in recent years. Conventional cloud-based DNN traini...
ISBN (print): 9798350323481
Genome graphs analysis has emerged as an effective means of mapping DNA fragments (known as reads) to the reference genome. It replaces the traditional linear reference with a graph-based representation that incorporates genetic variation and diversity information, significantly improving the quality of genotyping. An in-depth characterization of genome graphs analysis shows that it is bottlenecked by irregular seed index accesses and intensive alignment operations, stressing both the memory system and computing resources. Based on these observations, we propose MeG2, a lightweight, commodity DRAM-compliant, processing-in-memory architecture to accelerate genome graphs analysis. MeG2 integrates both near-memory processing and bitwise in-situ computation. Specifically, MeG2 leverages the low access latency of near-memory processing with an index-centric offload mechanism to alleviate irregular memory accesses in the seeding procedure, and harnesses the row-parallel capacity of in-situ computation with a distance-aware technique to exploit the intensive computational parallelism of the alignment process. Results show that MeG2 outperforms CPU-, GPU-, and ASIC-based genome graphs analysis solutions by 502× (30.2×), 272× (15.1×), and 5.5× (8.3×) for short (long) reads, while reducing energy consumption by 1628× (85.6×), 1443× (77.1×), and 7.8× (11.7×), respectively. We also demonstrate that MeG2 offers significant improvements over existing PIM-based genome sequence analysis accelerators.
Recent progress regarding the use of language models (LMs) as knowledge bases (KBs) has shown that language models can act as structured knowledge bases for storing relational facts. However, most existing works only ...
Federated learning (FL) enables massive clients to collaboratively train a global model by aggregating their local updates without disclosing raw data. Communication has become one of the main bottlenecks that prolong...
Container-based microservices have been widely applied to promote cloud elasticity. The mainstream Docker containers are structured in layers, which are organized in a stack with bottom-up dependencies. To start a mic...
In the medical realm, the pivotal role of pathological Whole Slide Images (WSIs) in detecting cancer, tracking disease progression, and evaluating treatment efficacy is indisputable. Nevertheless, the identification a...
Malware scanning of an app market is expected to be scalable and effective. However, existing approaches use syntax-based features that can be evaded by transformation attacks or semantic-based features which are usua...
ISBN (print): 9798350323481
Temporal graph processing handles the snapshots of a temporal graph, which capture how the graph changes over time. Although several software/hardware solutions have been designed for efficient temporal graph processing, they still suffer from severe irregular data access due to uncoordinated graph traversal. To overcome these limitations, this paper proposes SaGraph, a domain-specific hardware accelerator that supports the efficient processing of temporal graphs. Temporal graph processing shows strong data access similarity, i.e., most graph accesses made while processing different snapshots are the same and usually refer to a small fraction of vertices. SaGraph dynamically coordinates the graph traversals and adaptively caches the vertex states to fully exploit this data access similarity and lower the data access overhead. We implemented and evaluated SaGraph on a Xilinx Alveo U280 FPGA card. Compared with the cutting-edge software and hardware solutions, SaGraph achieves 8.5×-157.3×, 4.2×-16.1× speedups and 34.7×-423.6×, 5.3×-14.7× energy savings, respectively.
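The data access similarity that the SaGraph abstract refers to can be illustrated with a small sketch. This is not SaGraph's mechanism, which the abstract does not detail; it only measures, on made-up toy data, how much the vertex sets touched by the same traversal overlap across consecutive snapshots. The function names (`bfs_visited`, `access_similarity`), the choice of BFS as the traversal, and the use of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
from collections import deque


def bfs_visited(edges, source):
    """Return the set of vertices reached from `source` over a directed edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def access_similarity(snapshots, source):
    """Jaccard overlap of the vertex sets touched by BFS on consecutive snapshots.

    Assumption: vertex-level overlap stands in for "data access similarity";
    the paper may define and measure it differently.
    """
    visited = [bfs_visited(edges, source) for edges in snapshots]
    return [len(a & b) / len(a | b) for a, b in zip(visited, visited[1:])]


# Hypothetical toy data: three snapshots, each adding one edge to the previous one.
base = [(0, 1), (1, 2), (2, 3), (0, 4)]
snapshots = [base, base + [(4, 5)], base + [(4, 5), (5, 6)]]
similarity = access_similarity(snapshots, source=0)
print(similarity)
```

On this toy input the overlap between consecutive snapshots is high because each snapshot changes only one edge, which is the kind of redundancy a coordinated traversal or a vertex-state cache could exploit.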