Graph neural networks (GNNs) have emerged as powerful approaches to learn knowledge about graphs and *** rapid employment of GNNs poses requirements for processing *** to incompatibility of general platforms, dedicated hardware devices and platforms are developed to efficiently accelerate training and inference of *** conduct a survey on hardware acceleration for *** first include and introduce recent advances of the domain, and then provide a methodology of categorization to classify existing works into three ***, we discuss optimization techniques adopted at different *** finally we propose suggestions on future directions to facilitate further works.
Deep learning has gained superior accuracy on Euclidean structure data in neural *** a result, non-Euclidean structure data, such as graph data, has more sophisticated structural information, which can be applied in neural networks as well to address more complex and practical ***, actual graph data obeys a power-law distribution, so the adjacency matrix of a graph is random and *** processing accelerator (GPA) is designed to handle the problems ***, graph computing only processes 1-dimensional *** graph neural networks (GNNs), graph data is ***, GNNs include the execution processes of both traditional graph processing and neural networks, which have irregular memory access and regular computation, *** obtain more information in graph data and require better model generalization ability, the layers of GNN are deeper, so the overhead of memory access and computation is *** present, GNN accelerators are designed to deal with this *** this paper, we conduct a systematic survey regarding the design and implementation of GNN ***, we review the challenges faced by GNN accelerators, and existing related works in detail to process ***, we evaluate previous works and propose future directions in this booming field.
This paper investigates the multi-Unmanned Aerial Vehicle (UAV)-assisted wireless-powered Mobile Edge Computing (MEC) system, where UAVs provide computation and powering services to mobile *** aim to maximize the number of completed computation tasks by jointly optimizing the offloading decisions of all terminals and the trajectory planning of all *** action space of the system is extremely large and grows exponentially with the number of *** this case, single-agent learning will require an overlarge neural network, resulting in insufficient ***, the offloading decisions and trajectory planning are two subproblems performed by different executants, providing an opportunity for *** thus adopt the idea of decomposition and propose a 2-Tiered Multi-agent Soft Actor-Critic (2T-MSAC) algorithm, decomposing a single neural network into multiple small-scale *** the first tier, a single agent is used for offloading decisions, and an online pretrained model based on imitation learning is specially designed to accelerate the training process of this *** the second tier, UAVs utilize multiple agents to plan their *** agent exerts its influence on the parameter update of other agents through actions and rewards, thereby achieving joint *** results demonstrate that the proposed algorithm can be applied to scenarios with various location distributions of terminals, outperforming existing benchmarks that perform well only in specific *** particular, 2T-MSAC increases the number of completed tasks by 45.5% in the scenario with uneven terminal ***, the pretrained model based on imitation learning reduces the convergence time of 2T-MSAC by 58.2%.
In high-risk industrial environments like nuclear power plants, precise defect identification and localization are essential for maintaining production stability and safety. However, the complexity of such a harsh environment leads to significant variations in the shape and size of the defects. To address this challenge, we propose the multivariate time series segmentation network (MSSN), which adopts a multiscale convolutional network with multi-stage and depth-separable convolutions for efficient feature extraction through variable-length templates. To tackle the classification difficulty caused by structural signal variance, MSSN employs logarithmic normalization to adjust instance distributions. Furthermore, it integrates classification with smoothing loss functions to accurately identify defect segments amid similar structural and defect signal subsequences. Our algorithm, evaluated on both the Mackey-Glass dataset and an industrial dataset, achieves over 95% localization and demonstrates its capture capability on the synthetic dataset. On a nuclear plant's heat transfer tube dataset, it captures 90% of defect instances with a 75% middle localization F1 score.
Magnesium chips were coated with a high concentration of graphite using a binder and were used as the raw material for injection molding. The microstructure of the magnesium injection-molded product with added graphit...
Persistent memory (PM) promises byte-addressability, large capacity, and *** memory systems, such as key-value stores and in-memory databases, benefit from such features of *** to the great popularity of hashing index in main memory systems, a number of research efforts are made to provide high average performance persistent ***, suboptimal tail performance in terms of tail throughput and tail latency is still observed for existing persistent *** this paper, we analyze major sources of suboptimal tail performance from key design issues of persistent *** identify the global hash structure and concurrency control as remaining explorable design spaces for improving tail *** propose Directory-sharing Multi-level Extendible Hashing (Dalea) for *** designs ancestor link-based extendible hashing as well as fine-grained transient lock to address the two main sources (rehashing and locking) affecting tail *** evaluation results show that, compared with state-of-the-art persistent hashing Dash, Dalea achieves increased tail throughput by 4.1x and reduced tail latency by ***, in order to provide design guidelines for improving tail performance, we adopt Dalea as a testbed to identify different impacts of four factors on tail performance, including fine-grained rehashing, transient locking, memory pre-allocation, and fingerprinting.
The rapid development of emerging intelligent applications leads to a surge in computational demands and memory capacity requirements. Compute-in-memory (CIM) is a promising paradigm to alleviate the data movement bot...
We introduce and study reconfiguration problems for (internally) vertex-disjoint shortest paths: Given two tuples of internally vertex-disjoint shortest paths for fixed terminal pairs in an unweighted graph, we are as...
It is of great concern for high performance microprocessors to optimize logic delay and improve performance in the semi-custom design flow based on commercial standard cell library. To solve this issue, the paper prop...
Dividing a single System-on-Chip (SoC) into multiple chiplets and connecting them using 2.5D packaging technology is becoming a widely adopted approach to enhance chip scale and performance. However, when multiple chi...