Existing methods for knowledge base question generation (KBQG) learn a one-size-fits-all model by training on all subgraphs together, without distinguishing their diverse semantics. In this work, we show that ma...
Autonomous aerial vehicle (AAV)-assisted mobile edge computing (MEC) and data collection (DC) have been popular research issues. Different from existing works that consider MEC and DC scenarios separately, this articl...
ISBN (print): 9798350330663
Vulnerabilities are disclosed with corresponding patches so that users can remediate them in time. However, there are instances where patches are not released alongside the disclosed vulnerabilities, causing hidden dangers, especially if dependent software remains uninformed about the affected code repository. Hence, it is crucial to automatically locate security patches for disclosed vulnerabilities among a multitude of commits. Despite the promising performance of existing learning-based localization approaches, they still suffer from two limitations: (1) They perform poorly in data-scarcity scenarios. Most neural models require extensive datasets to capture the semantic correlations between the vulnerability description and code commits, while the number of disclosed vulnerabilities with patches is limited. (2) They struggle to capture the deep semantic correlations between the vulnerability description and code commits, owing to inherent differences in the semantics and characteristics of code changes versus commit messages; a single model has difficulty capturing these correlations. To mitigate these two limitations, we propose a novel security patch localization approach named Prom VPat, which utilizes a dual prompt tuning channel to capture the semantic correlation between vulnerability descriptions and commits, especially in data-scarcity (i.e., few-shot) scenarios. We first feed the commit message and code changes, together with the vulnerability description, into the prompt generator to produce two new inputs with prompt templates. Then, we adopt a pre-trained language model (PLM) as the encoder, fine-tune it with the prompt tuning method, and generate two correlation probabilities as semantic features. In addition, we extract 26 handcrafted features from the vulnerability descriptions and the code commits. Finally, we utilize the attention mechanism to fuse the...
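As a rough illustration of the prompt-generator step described above (the template wording, mask token, and function names below are assumptions for the sketch, not the paper's exact design), the two prompt-channel inputs might be built like this:

```python
MASK = "[MASK]"  # placeholder the PLM fills in; the real mask token is model-specific

def build_prompts(description, commit_message, code_changes):
    """Wrap one commit into two prompt-tuning inputs: one channel pairs
    the vulnerability description with the commit message, the other
    with the code changes; each asks the PLM to fill the mask slot."""
    msg_prompt = (
        f"Vulnerability: {description} Commit message: {commit_message} "
        f"This commit is {MASK} to the vulnerability."
    )
    code_prompt = (
        f"Vulnerability: {description} Code changes: {code_changes} "
        f"This commit is {MASK} to the vulnerability."
    )
    return msg_prompt, code_prompt
```

The encoder would then score the verbalizer word predicted at the mask slot in each channel, yielding the two correlation probabilities that are fused with the handcrafted features.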
ISBN (digital): 9798331543143
ISBN (print): 9798331543150
Cognitive diagnosis is a critical task in intelligent education, aimed at inferring students' mastery of knowledge concepts from their response logs. Although existing cognitive diagnosis models achieve excellent performance, they underestimate the difficulty of easy exercises and overestimate the difficulty of hard ones. We attribute this to the class imbalance in the response logs of easy and hard exercises. Moreover, the convergence speed varies from exercise to exercise during model training, which further challenges generalization. To address these problems, we propose a logit adjustment approach based on each exercise's correct rate, applicable to a wide range of cognitive diagnosis models. Specifically, we enforce logit adjustment in the training loss to overcome the class imbalance in response logs. Then, we apply group distributionally robust optimization to improve generalization. Finally, extensive experiments demonstrate the effectiveness of our model, especially on easy and hard exercises.
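The logit adjustment step can be sketched as follows. This is a minimal illustration assuming the standard logit-adjusted loss form, with the exercise's empirical correct rate playing the role of the class prior and `tau` a hypothetical scaling hyperparameter; the abstract does not specify the exact loss.

```python
import math

def bce_with_logit_adjustment(logit, label, correct_rate, tau=1.0):
    """Binary cross-entropy with logit adjustment: during training,
    shift the raw 'correct' logit by the exercise's prior log-odds
    (derived from its correct rate), so easy exercises need a larger
    margin to be predicted correct. Inference uses the raw logit."""
    z = logit + tau * math.log(correct_rate / (1.0 - correct_rate))
    p = 1.0 / (1.0 + math.exp(-z))  # sigmoid of the adjusted logit
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))
```

For the same raw logit and a "correct" label, an easy exercise (high correct rate) incurs a smaller training loss than a hard one, which counteracts the imbalance described above.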
Enterprises currently face the challenge of reducing production cycles and costs, and utilizing existing cases for changes and iterations has emerged as a viable solution. However, the acquisition and modificati...
Partial label learning is a weakly supervised learning framework in which each instance is associated with multiple candidate labels, among which only one is the ground-truth label. This paper proposes a unified formulation that employs proper label constraints for training models while simultaneously performing pseudo-labeling. Unlike existing partial label learning approaches that only leverage similarities in the feature space without utilizing label constraints, our pseudo-labeling process leverages similarities and differences in the feature space under the same candidate label constraints and then disambiguates noisy labels. Extensive experiments on artificial and real-world partial label datasets show that our approach significantly outperforms state-of-the-art counterparts on classification prediction.
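The candidate-constrained pseudo-labeling idea can be illustrated with a toy nearest-neighbour voting rule. The voting scheme here is our own simplification, not the paper's formulation, and the names `pseudo_label` and `k` are illustrative:

```python
import numpy as np

def pseudo_label(features, candidates, k=3):
    """Assign each instance a pseudo-label restricted to its candidate
    set: each candidate label is scored by votes from the k nearest
    neighbours whose own candidate sets also contain that label."""
    # pairwise Euclidean distances via broadcasting
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    labels = []
    for i in range(len(features)):
        neighbours = [j for j in np.argsort(dists[i]) if j != i][:k]
        scores = {c: sum(c in candidates[j] for j in neighbours)
                  for c in candidates[i]}
        labels.append(max(scores, key=scores.get))
    return labels
```

The key point the abstract makes is visible even in this toy version: the candidate sets constrain both which labels can be assigned and which neighbours count as evidence, rather than relying on feature similarity alone.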
Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network architectures. However, these neural network models require labeling a tremendous amount of training data, which is prohibitively expensive in practice. In this paper, we propose an OnLine Machine Learning (OLML) database that stores trained models and reuses them in new training tasks to achieve a better training effect with a small amount of training data. An efficient model reuse algorithm, AdaReuse, is developed in the OLML database. Specifically, AdaReuse first estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task can be selected. Then, the selected models are trained iteratively to encourage diverse models, with which a better training effect can be achieved by ensembling. We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show that AdaReuse can improve the training effect significantly compared with models trained from scratch when the training data is limited. Based on AdaReuse, we implement an OLML database prototype system that accepts a training task as an SQL-like query and automatically generates a training plan by selecting and reusing trained models. Case studies are conducted to illustrate that the OLML database can properly store trained models and reuse them efficiently in new training tasks.
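The reuse-potential estimation can be sketched roughly as follows. Representing domains as vectors and combining relatedness with quality by a simple product are assumptions made for this illustration, not AdaReuse's actual scoring:

```python
import math

def select_models(stored_models, task_domain_vec, top_k=2):
    """Rank stored models by an illustrative reuse-potential score:
    cosine similarity between the model's domain vector and the new
    task's domain vector, weighted by the model's held-out quality."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))
    scored = [(cosine(m["domain_vec"], task_domain_vec) * m["quality"], m["name"])
              for m in stored_models]
    scored.sort(reverse=True)  # highest reuse potential first
    return [name for _, name in scored[:top_k]]
```

Selecting more than one model, as above, matches the abstract's point that multiple selected models are then trained iteratively and combined by ensembling.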
The minimum weakly connected dominating set problem is a typical NP-hard problem with a wide range of applications. To solve this problem, we propose a frequency property and two-hop configuration checking strategy-dr...
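For context on the problem itself: a set S is a weakly connected dominating set if it dominates every vertex and the subgraph weakly induced by S (all edges with at least one endpoint in S) is connected. A minimal feasibility check, written as our own illustration and unrelated to the proposed local search, might read:

```python
from collections import deque

def is_wcds(adj, s):
    """Return True iff vertex set s is a weakly connected dominating
    set of the graph adj (dict: vertex -> list of neighbours)."""
    s = set(s)
    if not s:
        return not adj
    closed = s | {v for u in s for v in adj[u]}   # closed neighbourhood N[S]
    if closed != set(adj):                        # s must dominate every vertex
        return False
    # BFS restricted to edges with at least one endpoint in s
    start = next(iter(s))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if (u in s or v in s) and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == closed
```

On a path 1-2-3-4-5, for example, {2, 4} is a weakly connected dominating set, while {1, 4} dominates every vertex but fails the weak-connectivity condition.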
Cross-lingual image captioning, with its ability to caption an unlabeled image in a target language other than English, is an emerging topic in the multimedia field. In order to save the precious human resource from r...