In off-policy reinforcement learning, prioritized experience replay plays an important role. However, centralized prioritized experience replay becomes a bottleneck for efficient training. We propose to approxim...
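For context on the replay mechanism this abstract refers to, here is a minimal sketch of proportional prioritized experience replay in its basic, centralized single-buffer form; the approximation the abstract proposes is not shown, and all names (`PrioritizedReplay`, `alpha`) are illustrative rather than taken from the paper.

```python
# Minimal proportional prioritized experience replay (illustrative only).
import numpy as np

class PrioritizedReplay:
    def __init__(self, capacity: int, alpha: float = 0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition, priority: float) -> None:
        # Overwrite the oldest slot once the buffer is full (ring buffer).
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = priority ** self.alpha
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int):
        # Sample indices with probability proportional to stored priority.
        p = self.priorities[:len(self.data)]
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

buf = PrioritizedReplay(capacity=1000)
for i in range(100):
    buf.add(("obs", "action", float(i)), priority=abs(np.random.randn()) + 1e-3)
print(buf.sample(4)[0])
```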
ISBN (Digital): 9798350359312
ISBN (Print): 9798350359329
Sparse matrix reordering is an important step in Cholesky decomposition. By reordering the rows and columns of the matrix, the computation time and storage cost can be greatly reduced. As various reordering algorithms have been proposed, selecting a suitable reordering method for a given matrix has become an important research topic. In this paper, we propose a method to predict the optimal reordering method by visualizing sparse matrices in chunks in parallel and feeding the resulting images into a deep convolutional neural network. The results show that the theoretical performance reaches 95% of the optimal performance, the prediction accuracy reaches up to 85%, the parallel framework achieves an average speedup of 11.35x over the serial framework, and performance on large sparse matrices is greatly improved compared with the traversal selection method.
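To make the chunk-based visualization step concrete, below is a minimal sketch, under assumed parameters, of how a sparse matrix could be summarized into a fixed-size block-density image suitable as CNN input; the image resolution, the function name `block_density_image`, and the normalization are illustrative choices, and the paper's parallel framework and network architecture are not reproduced here.

```python
# Illustrative chunk-based "visualization" of a sparse matrix as a density image.
import numpy as np
import scipy.sparse as sp

IMG_SIZE = 128  # assumed resolution of the density image

def block_density_image(A: sp.csr_matrix, size: int = IMG_SIZE) -> np.ndarray:
    """Partition A into size x size blocks and record per-block nonzero density."""
    n_rows, n_cols = A.shape
    img = np.zeros((size, size), dtype=np.float32)
    coo = A.tocoo()
    # Map each nonzero to its block and accumulate counts per block.
    r = (coo.row * size // n_rows).astype(np.int64)
    c = (coo.col * size // n_cols).astype(np.int64)
    np.add.at(img, (r, c), 1.0)
    # Normalize by the average number of matrix cells covered by one block.
    img /= max(1.0, (n_rows / size) * (n_cols / size))
    return np.clip(img, 0.0, 1.0)

if __name__ == "__main__":
    A = sp.random(10000, 10000, density=1e-4, format="csr", random_state=0)
    img = block_density_image(A)
    print(img.shape, img.max())
```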
Accurately mapping surface rivers is important for ecological environment monitoring and disaster prevention. The development of remote sensing technology and computer vision has greatly improved the efficiency of this...
Cold data accounts for a large portion of big data today and is usually stored in secondary storage. Various sketch data structures are implemented to represent the stored elements and provide constant-time members...
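One well-known example of a sketch structure with constant-time (approximate) membership tests is a Bloom filter; the sketch below is a minimal illustration under that assumption and is not claimed to be the structure used in the paper.

```python
# Minimal Bloom filter: constant-time approximate membership (illustrative only).
import hashlib

class BloomFilter:
    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: bytes):
        # Derive k independent bit positions from salted BLAKE2b hashes.
        for i in range(self.num_hashes):
            h = hashlib.blake2b(key, digest_size=8, salt=i.to_bytes(8, "little"))
            yield int.from_bytes(h.digest(), "little") % self.num_bits

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes) -> bool:
        # False positives are possible; false negatives are not.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

bf = BloomFilter()
bf.add(b"cold-object-42")
print(bf.might_contain(b"cold-object-42"), bf.might_contain(b"missing"))
```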
Deep Reinforcement Learning has been successfully applied in various applications and has achieved impressive performance compared with traditional methods, but it suffers from high computation cost and long training time. MLPerf takes deep reinforcement learning as one of its benchmark tracks and provides a single-node training version of MiniGo as a reference. A key challenge is to achieve efficient MiniGo training on a large-scale computing system. According to the training computation pattern of MiniGo and the characteristics of our large-scale heterogeneous computing system, we propose a MultiLevel parallel strategy, MLPs, comprising task-level parallelism between nodes, CPU-DSP heterogeneous parallelism, and DSP multi-core parallelism. The proposed method reduces the overall execution time from 43 hours to 16 hours while scaling the node count from 1067 to 4139, with a scaling efficiency of 69.1%. According to our fitting method, the scaling efficiency is 46.5% when scaling to 8235 nodes. The experimental results show that the proposed method achieves efficient training of MiniGo on the large-scale heterogeneous computing system.
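As an illustration of the task-level layer only, here is a minimal sketch that distributes independent self-play games across worker processes; the CPU-DSP heterogeneous and DSP multi-core layers of MLPs are hardware-specific and are not modeled, and all names (`play_one_game`, `NUM_WORKERS`) are hypothetical.

```python
# Task-level parallelism sketch: independent self-play tasks over a worker pool.
import multiprocessing as mp
import random

NUM_WORKERS = 4          # stands in for the node/accelerator count
GAMES_PER_WORKER = 8     # each task is an independent self-play game

def play_one_game(seed: int) -> int:
    """Placeholder for a self-play game; returns a fake game length."""
    rng = random.Random(seed)
    return rng.randint(50, 200)

def main() -> None:
    seeds = range(NUM_WORKERS * GAMES_PER_WORKER)
    with mp.Pool(processes=NUM_WORKERS) as pool:
        lengths = pool.map(play_one_game, seeds)
    print(f"played {len(lengths)} games, mean length {sum(lengths)/len(lengths):.1f}")

if __name__ == "__main__":
    main()
```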
With serverless computing offering more efficient and cost-effective application deployment, the diversity of serverless platforms presents challenges to users, including platform lock-in and costly migration. Moreove...
Disaggregated memory (DM) is a widely discussed datacenter architecture in academia and industry. It decouples computing and memory resources from monolithic servers into two network-connected resource pools. Range in...
In the Internet of Everything (IoE), message delay cannot be guaranteed due to its complexity and heterogeneity, and a centralized model is not sufficient for data collaboration. By leveraging th...
Emerging blockchain accounting mechanisms allow mutually distrusting parties to transmit trusted information and ensure the correctness of data. Every blockchain node stores the complete blockchain locally. Although this m...
ISBN (Digital): 9798350359312
ISBN (Print): 9798350359329
Temporal Knowledge Graph Completion (TKGC) aims to predict the missing parts of quadruples, which is crucial for real-life knowledge graphs. Compared with methods that only use graph neural networks, the emergence of pre-trained models has introduced a trend of simultaneously leveraging text and graph structure information. However, most current methods based on pre-trained models struggle to utilize text and multi-hop graph structure information concurrently, resulting in insufficient mining of relation associations. To address this challenge, we propose a novel model: Temporal Closing Path for Pre-trained Language Model-based TKGC (TCP-PLM). We obtain the temporal closing relation path of the target relation through sampling and use the relation path as a bridge to leverage text and multi-hop graph structure information simultaneously. Moreover, the relation path serves as a tool for mining associations between relations. At the same time, because the relation paths are entity-independent, our model can also handle the inductive setting. Our experiments on three benchmarks, along with extensive analysis, demonstrate that our model not only achieves substantial performance improvements across four metrics compared with other models but also handles inductive settings adeptly.
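As a rough illustration of how an entity-independent relation path between the head and tail of a target quadruple might be sampled, here is a minimal sketch that restricts the search to facts no later than the query timestamp; this is an assumption-laden stand-in, not the paper's definition of a temporal closing path, and all names are illustrative.

```python
# Illustrative sampler: shortest relation path head -> tail over past facts.
from collections import defaultdict, deque

Quad = tuple[str, str, str, int]  # (head, relation, tail, timestamp)

def sample_relation_path(quads: list[Quad], head: str, tail: str,
                         query_time: int, max_hops: int = 3) -> list[str] | None:
    """Breadth-first search for a short relation path from head to tail using
    only quadruples with timestamp <= query_time; returns the relation sequence."""
    adj = defaultdict(list)
    for h, r, t, ts in quads:
        if ts <= query_time:
            adj[h].append((r, t))
    queue = deque([(head, [])])
    seen = {head}
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_hops:
            continue
        for r, nxt in adj[node]:
            if nxt == tail:
                return path + [r]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [r]))
    return None

quads = [("A", "visits", "B", 1), ("B", "hosts", "C", 2), ("A", "calls", "C", 5)]
print(sample_relation_path(quads, "A", "C", query_time=3))  # ['visits', 'hosts']
```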