Developers often seek solutions for their programming problems by retrieving existing questions on technical Q&A sites such as Stack Overflow. In many cases, they fail to find relevant questions due to the knowledge gap between the questions and the queries, or find it hard to choose the desired questions from the returned results because the results carry no explanation of their relevance. In this paper, we propose KGXQR, a knowledge-graph-based explainable question retrieval approach for programming tasks. It uses BERT-based sentence similarity to retrieve candidate Stack Overflow questions that are relevant to a given query. To bridge the knowledge gap and enhance the performance of question retrieval, it constructs a concept knowledge graph for software development and trains a question relevance prediction model to re-rank the candidate questions. The model is trained on a combined sentence representation of BERT-based sentence embedding and graph-based concept embedding. To help users understand the relevance of the returned Stack Overflow questions, KGXQR further generates explanations based on the association paths between the concepts involved in the query and the Stack Overflow questions. The evaluation shows that KGXQR outperforms the baselines in terms of accuracy, recall, MRR, and MAP, and that the generated explanations help users find the desired questions faster and more accurately.
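The retrieve-then-re-rank flow described in this abstract can be illustrated with a minimal sketch. It is not KGXQR's implementation: the vectors are hand-made toy values standing in for BERT sentence embeddings and graph-based concept embeddings, the function names are hypothetical, and plain cosine similarity over the concatenated vectors is swapped in for the trained relevance prediction model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_candidates(query_sent, questions, k):
    """Stage 1 (assumed form): keep the top-k questions by sentence-embedding similarity."""
    ranked = sorted(questions, key=lambda q: cosine(query_sent, q["sent"]), reverse=True)
    return ranked[:k]

def rerank(query_sent, query_concept, candidates):
    """Stage 2 (stand-in): re-score candidates on concatenated sentence + concept vectors.

    The paper trains a relevance model on the combined representation;
    cosine similarity is used here only to keep the sketch self-contained.
    """
    query_vec = query_sent + query_concept
    return sorted(
        candidates,
        key=lambda c: cosine(query_vec, c["sent"] + c["concept"]),
        reverse=True,
    )

# Toy data: "sent" mimics a sentence embedding, "concept" a concept embedding.
questions = [
    {"id": "q1", "sent": [1.0, 0.0], "concept": [0.0, 1.0]},
    {"id": "q2", "sent": [0.9, 0.1], "concept": [1.0, 0.0]},
    {"id": "q3", "sent": [0.0, 1.0], "concept": [1.0, 0.0]},
]
query_sent = [1.0, 0.0]
query_concept = [1.0, 0.0]

candidates = retrieve_candidates(query_sent, questions, k=2)
reranked = rerank(query_sent, query_concept, candidates)
print([c["id"] for c in candidates], [c["id"] for c in reranked])
```

In this toy setup, q1 wins on wording alone, while q2 moves ahead once the concept vectors are taken into account, which is the kind of knowledge-gap effect the re-ranking step is meant to capture.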
Document-level relation extraction (DocRE) has attracted growing research interest recently. While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied: Do they make the...
As basic elements of a program, variables convey essential information that is critical for program comprehension and maintenance. However, understanding the meanings of variables in a program is not always easy for developers, since poor-quality variable names are prevalent and such variables are less informative for program comprehension. Therefore, in this paper, we aim to generate concise natural language explanations for variables to facilitate program comprehension. In particular, there are two challenges in variable explanation generation: the lack of training data and the association with complex code contexts around the variable. To address these issues, we propose a novel approach, ZeroVar, which leverages code pre-trained models and zero-shot prompt learning to generate explanations for a variable based on its code context. ZeroVar contains two stages: (i) a pre-training stage that continually pre-trains a base model (i.e., CodeT5) to recover the randomly masked parameter descriptions in method docstrings; and (ii) a zero-shot prompt learning stage that leverages the pre-trained model to generate explanations for a given variable via a prompt constructed from the variable and the context of its enclosing method. We then extensively evaluate the quality and usefulness of the variable explanations generated by ZeroVar. We construct an evaluation dataset of 773 variables and their reference explanations. Our results show that ZeroVar can generate higher-quality explanations than baselines, not only on automated metrics such as BLEU and ROUGE, but also on human metrics such as correctness, completeness, and conciseness. Moreover, we further assess the usefulness of ZeroVar-generated explanations on two downstream tasks related to variable naming quality, i.e., abbreviation expansion and spelling correction. For abbreviation expansion, the generated variable explanations can help improve the present rate (+13.1%), precision (+3.6%), and recall (+10.0%) of
Library migration, which replaces the current library with a different one to retain the same software behavior, is common in software evolution. An essential part of this is finding an analogous API for the desired f...
Unmanned Aerial Vehicle (UAV) detection in the wild is a challenging task due to the presence of background noise and the varying size of the object. To address these obstacles, we propose a novel learning framework f...
Program-of-Thought (PoT), which aims to use programming language instead of natural language as an intermediate step in reasoning, is an important way for LLMs to solve mathematical problems. Since different programmi...
Generative Language Models (LMs) such as ChatGPT have exhibited remarkable performance across various downstream tasks. Nevertheless, one of their most prominent drawbacks is generating inaccurate or false information...
To advance personalized applications such as recommendation systems and user behavior prediction, recent research increasingly adopts large language models (LLMs) for human-readable persona modeling. In dynamic real-w...
The chiral feature of an optical field can be evaluated by the parameter of g-factor enhancement, which is helpful to enhance chiroptic signals from a chiral *** this work, the superchiral spot has been theoretically proposed in metal-insulator-metal *** g-factor enhancement of the superchiral spot can be enhanced by 67-fold more than that of circularly polarized light, and the spot is confined in the deep wavelength scale along each spatial ***, the position of the superchiral spot can be tuned by manipulating the incident *** tunable superchiral spot may find applications in chiral imaging and sensing.
In this article, we carry out stochastic comparisons on the maximum order statistics arising from two batches of multiple-outlier gamma random variables with different shape and scale *** is proved that, under certain conditions, the majorization order between the vectors of shape parameters together with the weak majorization order [p-larger order] between the vectors of scale parameters implies the likelihood ratio order [hazard rate order] between the largest order *** results established here strengthen and generalize some known ones in the literature.