Due to the limited resources of edge networks, the heterogeneity of user content requests, high-cost caching from direct resource hits, and redundancy in resource retention time hinder system performance. Traditional ...
In this paper, we propose a general deep learning training framework XGrad which introduces weight prediction into the popular gradient-based optimizers to boost their convergence and generalization when training the ...
ISBN (digital): 9798350349184
ISBN (print): 9798350349191
With the increasing performance of deep convolutional neural networks, they have been widely used in many computer vision tasks. However, a large convolutional neural network requires substantial memory and computing resources, making it difficult to meet the low-latency and reliability requirements of edge computing when the model is deployed locally on resource-limited edge devices. Quantization is a model compression technique that can effectively reduce model size, computation cost, and inference latency, but quantization noise degrades the accuracy of the quantized model. To address the precision loss caused by model quantization, this paper proposes a post-training quantization method based on scale optimization. By reducing the influence of redundant parameters on the quantization parameters during quantization, the method optimizes the scale factor to reduce quantization error, thereby improving the accuracy of the quantized model, lowering inference latency, and improving the reliability of edge applications. Experimental results show that, across different quantization strategies and bit widths, the proposed method improves the accuracy of the quantized model, with the best quantized model gaining 1.36% in absolute accuracy. The improvement is significant and facilitates the deployment of deep neural networks in edge environments.
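The scale-optimization idea in this abstract can be sketched as follows. This is a minimal illustration, assuming symmetric 8-bit uniform quantization and a grid search over the scale factor that minimizes quantization error; the function names and search criterion are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def quantize(w, scale, bits=8):
    """Symmetric uniform quantization of weights w, then dequantization."""
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def optimize_scale(w, bits=8, n_grid=100):
    """Grid-search the scale factor minimizing quantization MSE.

    The naive choice max(|w|)/qmax is dominated by outlier ("redundant")
    parameters; shrinking the scale trades a little clipping of outliers
    for lower rounding error on the bulk of the weights.
    """
    qmax = 2 ** (bits - 1) - 1
    base = np.abs(w).max() / qmax
    best_scale = base
    best_err = np.mean((w - quantize(w, base, bits)) ** 2)
    for frac in np.linspace(0.2, 1.0, n_grid):
        scale = base * frac
        err = np.mean((w - quantize(w, scale, bits)) ** 2)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale
```

With an outlier-heavy weight tensor, the optimized scale is never worse than the naive max-based scale, since the naive scale is included in the search.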
Retrieval-augmented generation (RAG) has shown promising potential in knowledge intensive question answering (QA). However, existing approaches only consider the query itself, neither specifying the retrieval preferen...
ISBN (print): 9798350337488
With the exponential growth of biomedical knowledge in unstructured text repositories such as PubMed, it is imperative to establish a knowledge-graph-style, efficiently searchable, and targeted database that can support the information-retrieval needs of researchers and clinicians. To mine knowledge from graph databases, most previous methods treat a triple in a graph (see Fig. 1) as the basic processing unit and embed the triple elements (i.e., drugs/chemicals, proteins/genes, and their interaction) as separate embedding matrices, which cannot capture the semantic correlation among triple elements. To remedy the loss of semantic correlation caused by disjoint embeddings, we propose a novel approach that learns triple embeddings by combining entities and interactions into a unified representation. Furthermore, traditional methods usually learn triple embeddings from scratch, which cannot exploit the rich domain knowledge embedded in pre-trained models; this is also a significant reason why they cannot distinguish the differences implied by the same entity across multi-interaction triples. In this paper, we propose a novel fine-tuning-based approach that learns better triple embeddings by creating weakly supervised signals from pre-trained knowledge graph embeddings. The method automatically samples triples from knowledge graphs and estimates their pairwise similarity from pre-trained embedding models. The triples are then fed pairwise into a Siamese-like neural architecture, where the triple representation is fine-tuned, bootstrapped by the triple similarity scores. Finally, we demonstrate that triple embeddings learned with our method can be readily applied to several downstream applications (e.g., triple classification and triple clustering). We evaluated the proposed method on two open-source drug-protein knowledge graphs constructed from PubMed abstracts, as provided by BioCreative. Our method achieves consistent improvement in both t...
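The weak-supervision step described above, deriving pairwise similarity targets for the Siamese network from pre-trained embeddings, might look roughly like this. The concatenation scheme and the names `triple_vector` and `weak_label` are illustrative assumptions for the sketch, not the paper's implementation.

```python
import numpy as np

def triple_vector(emb, head, interaction, tail):
    """Unified triple representation: here, a simple concatenation of the
    pre-trained head, interaction, and tail embeddings (illustrative choice)."""
    return np.concatenate([emb[head], emb[interaction], emb[tail]])

def weak_label(emb, triple_a, triple_b):
    """Pairwise cosine similarity between two triples, computed from
    pre-trained embeddings; serves as the weakly supervised target that
    the Siamese fine-tuning would regress toward."""
    a = triple_vector(emb, *triple_a)
    b = triple_vector(emb, *triple_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A triple paired with itself scores 1.0, and triples sharing an interaction and tail but differing in head entity score strictly below 1.0, giving the fine-tuning stage a graded similarity signal rather than binary labels.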
The perception module of self-driving vehicles relies on a multi-sensor system to understand its environment. Recent advancements in deep learning have led to the rapid development of approaches that integrate multi-s...
The interface between data-driven learning methods and classical simulation poses an interesting field offering a multitude of new applications. In this work, we build on the notion of physics-informed neural networks...
Autonomous driving systems require real-time environmental perception to ensure user safety and experience. Streaming perception is a task of reporting the current state of the world, which is used to evaluate the del...
The pre-trained language model BERT has brought significant performance improvements to a range of natural language processing tasks, but due to the model's large scale it is difficult to apply in many practical scenarios. With the continuous development of edge computing, deploying models on resource-constrained edge devices has become a trend. In a distributed edge environment, accounting for differences in data distribution, labeling costs, and privacy while shrinking the model is a critical task. This paper proposes a new BERT distillation method with source-free unsupervised domain adaptation. By combining source-free unsupervised domain adaptation and knowledge distillation for joint optimization, the performance of the BERT model is improved on cross-domain data. Compared with other methods, our method improves average prediction accuracy by up to around 4% in experimental evaluation on a cross-domain sentiment analysis task.
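The knowledge-distillation component of a method like this can be illustrated with the standard soft-target loss: a temperature-softened KL term matching the student to the teacher, blended with cross-entropy on hard labels. The temperature, weighting, and function names below are generic assumptions, not the paper's specific formulation.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled, numerically stable softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target distillation: KL(teacher || student) at temperature T,
    scaled by T^2 (standard gradient-magnitude correction), blended with
    cross-entropy on the hard labels."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1).mean()
    ce = -np.log(
        softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12
    ).mean()
    return alpha * (T ** 2) * kl + (1 - alpha) * ce
```

When the student exactly matches the teacher, the KL term vanishes and only the weighted hard-label cross-entropy remains, which is a quick sanity check on any implementation.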
Knowledge Tracing (KT) is a critical but challenging problem for many educational applications. As an essential part of educational psychology, the propagated influence among pedagogical concepts (i.e., learning trans...