Code language models (codeLMs) and Graph Neural Networks (GNNs) are widely used in code vulnerability detection. However, a critical yet often overlooked issue is that GNNs primarily rely on aggregating information from adjacent nodes, limiting structural information transfer to single-layer updates. In code graphs, nodes and relationships typically require cross-layer information propagation to fully capture complex program logic and potential vulnerability patterns. Furthermore, while some studies utilize codeLMs to supplement GNNs with code semantic information, existing integration methods have not fully explored the potential of their collaborative effects. To address these challenges, we introduce Vul-LMGNNs, a framework that integrates pre-trained codeLMs with GNNs, leveraging knowledge distillation to facilitate cross-layer propagation of both code semantic knowledge and structural information. Specifically, Vul-LMGNNs utilizes Code Property Graphs (CPGs) to incorporate code syntax, control flow, and data dependencies, while employing gated GNNs to extract structural information from the CPG. To achieve cross-layer information transmission, we implement an online knowledge distillation (KD) scheme that enables a single student GNN to acquire structural information extracted from a simultaneously trained counterpart through an alternating training procedure. Additionally, we leverage pre-trained codeLMs to extract semantic features from code sequences. Finally, we propose an "implicit-explicit" joint training framework to better leverage the strengths of both codeLMs and GNNs. In the implicit phase, we utilize codeLMs to initialize the node embeddings of each student GNN. Through online knowledge distillation, we facilitate the propagation of both code semantics and structural information across layers. In the explicit phase, we perform linear interpolation between the codeLM and the distilled GNN to learn a late fusion model. The proposed method, evaluated across four real-world datasets, …
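The abstract does not include the authors' implementation, but the two mechanisms it names, online knowledge distillation between simultaneously trained student GNNs and late fusion by linear interpolation between the codeLM and the distilled GNN, can be sketched in a few lines of PyTorch. Everything below (the module names, the learnable interpolation weight, the temperature) is an illustrative assumption, not the paper's code.

```python
# Hedged sketch, not Vul-LMGNNs itself: (1) an online knowledge-distillation
# loss between two peer "student" GNNs and (2) late fusion via linear
# interpolation of codeLM and GNN class probabilities, as the abstract
# describes. Whether alpha is learned or fixed is an assumption here.
import torch
import torch.nn.functional as F


def online_kd_loss(student_logits, peer_logits, temperature=2.0):
    """KL term pushing one student GNN toward its peer's (detached)
    softened predictions, in the style of online/mutual distillation."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_peer = F.softmax(peer_logits.detach() / temperature, dim=-1)
    return F.kl_div(log_p_student, p_peer, reduction="batchmean") * temperature ** 2


class LateFusion(torch.nn.Module):
    """Linear interpolation between codeLM and GNN probabilities,
    with a learnable scalar weight squashed into (0, 1)."""

    def __init__(self):
        super().__init__()
        self.alpha = torch.nn.Parameter(torch.zeros(1))

    def forward(self, lm_logits, gnn_logits):
        a = torch.sigmoid(self.alpha)
        probs = a * F.softmax(lm_logits, -1) + (1 - a) * F.softmax(gnn_logits, -1)
        return probs.log()  # log-probabilities, usable with F.nll_loss


if __name__ == "__main__":
    lm_logits = torch.randn(8, 2)   # codeLM predictions (vulnerable / benign)
    gnn_a = torch.randn(8, 2, requires_grad=True)  # student GNN being updated
    gnn_b = torch.randn(8, 2)                      # simultaneously trained peer
    labels = torch.randint(0, 2, (8,))

    fusion = LateFusion()
    loss = F.nll_loss(fusion(lm_logits, gnn_a), labels) + online_kd_loss(gnn_a, gnn_b)
    loss.backward()
    print(float(loss))
```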
Test-to-code traceability links (TCTLs) establish links between test artifacts and code artifacts. These links enable developers and testers to quickly identify the specific pieces of code tested by particular test cases, thus facilitating more efficient debugging, regression testing, and maintenance activities. Various approaches, based on distinct concepts, have been proposed to establish method-level TCTLs, specifically linking unit tests to their corresponding focal methods. Static methods, such as naming-convention-based methods, use heuristic- and similarity-based strategies. However, such methods face the following challenges: (1) Developers, driven by specific scenarios and development requirements, may deviate from naming conventions, leading to TCTL identification failures. (2) Static methods often overlook the rich semantics embedded within tests, leading to erroneous associations between tests and semantically unrelated code fragments. Although dynamic methods achieve promising results, they require the project to be compilable and the tests to be executable, limiting their usability. This limitation is significant for downstream tasks requiring massive test-code pairs, as not all projects can meet these requirements. To tackle the aforementioned limitations, we propose a novel static method-level TCTL approach named TestLinker. For the first challenge of existing static approaches, TestLinker introduces a two-phase TCTL framework to accommodate different project types in a triage manner. For the second challenge, we employ semantic correlation learning, which learns and establishes semantic correlations between tests and focal methods based on pre-trained code models (PCMs). TestLinker further establishes mapping rules to accurately link the recommended function name to the concrete production function declaration. Empirical evaluation on a meticulously labeled dataset reveals that TestLinker significantly outperforms traditional static techniques …
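TestLinker's implementation is not shown here, but the two-phase triage the abstract describes can be approximated: phase 1 tries naming-convention matching, and phase 2 falls back to semantic similarity between the test and candidate focal methods. The sketch below is a minimal stand-in under stated assumptions: the toy_embed bag-of-identifiers encoder substitutes for a real pre-trained code model such as CodeBERT, and the similarity threshold is arbitrary.

```python
# Minimal sketch of a two-phase (triage-style) method-level TCTL linker.
# Not TestLinker itself: the phase boundaries, the toy_embed() stand-in
# for a PCM encoder, and the threshold are illustrative assumptions.
import math
import re
from collections import Counter


def toy_embed(code: str) -> Counter:
    """Bag-of-identifiers stand-in for a PCM encoder (e.g. CodeBERT)."""
    return Counter(re.findall(r"[A-Za-z_]\w*", code.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def phase1_naming(test_name: str, candidates: dict[str, str]) -> str | None:
    """Naming convention: 'test_parse_json' / 'testParseJson' -> 'parse_json'."""
    key = test_name.removeprefix("test_").removeprefix("test")
    key = key.lower().replace("_", "")
    for name in candidates:
        if name.lower().replace("_", "") == key:
            return name
    return None


def phase2_semantic(test_body: str, candidates: dict[str, str],
                    threshold: float = 0.1) -> str | None:
    """Fallback: most semantically similar candidate, if it clears a
    (here arbitrary) similarity threshold."""
    t = toy_embed(test_body)
    best = max(candidates, key=lambda n: cosine(t, toy_embed(candidates[n])))
    return best if cosine(t, toy_embed(candidates[best])) >= threshold else None


def link(test_name: str, test_body: str, candidates: dict[str, str]):
    return phase1_naming(test_name, candidates) or phase2_semantic(test_body, candidates)


if __name__ == "__main__":
    methods = {"parse_json": "def parse_json(s): return json.loads(s)",
               "dump_json": "def dump_json(o): return json.dumps(o)"}
    print(link("test_parse_json", "", methods))                 # phase 1 hit
    print(link("check_roundtrip",
               "obj = parse_json(text); assert obj", methods))  # phase 2 hit
```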
ISBN (print): 9798350395693; 9798350395686
Code comment generation aims to generate high-quality comments from source code automatically and has been studied for years. Recent studies proposed integrating information retrieval techniques with neural generation models to tackle this problem, i.e., Retrieval-Augmented Comment Generation (RACG) approaches, and achieved state-of-the-art results. Generally, RACG approaches use a retriever to retrieve a code-comment pair from a retrieval base as an exemplar, combine the exemplar with the input code snippet, and feed the combined text to a generator (usually a sequence-to-sequence model) to generate the comment. However, the retrievers in previous work are built independently of their generators. As a result, the retrieved exemplars are not necessarily the most useful ones for generating comments, limiting the performance of existing approaches. To address this limitation, we propose a novel training strategy that enables the retriever to learn from the feedback of the generator and retrieve exemplars that better serve generation. Specifically, during training, we use the retriever to retrieve the top-k exemplars and calculate their retrieval scores, and use the generator to calculate a generation loss for the sample based on each exemplar. By aligning high-score exemplars retrieved by the retriever with low-loss exemplars observed by the generator, the retriever learns to retrieve exemplars that best improve the quality of the generated comments. Based on this strategy, we propose a novel RACG approach named JOINTCOM and evaluate it on two real-world datasets, JCSD and PCSD. The experimental results demonstrate that our approach surpasses the state-of-the-art baselines by 7.3% to 30.0% in terms of five metrics on the two datasets. We also conduct a human evaluation to compare JOINTCOM with the best-performing baselines. The results indicate that JOINTCOM outperforms the baselines, producing comments that are more natural, informative, and useful.
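The abstract describes the training signal, aligning high-score exemplars with low-loss exemplars, without giving the exact objective. One plausible instantiation, sketched below under that assumption, treats the softmax over negative per-exemplar generation losses as a target distribution and trains the retriever's score distribution toward it with a KL term, in the style of retriever-from-generator-feedback methods such as REPLUG; the temperature and the KL form are illustrative choices, not necessarily JOINTCOM's.

```python
# Hedged sketch of one possible retriever-alignment objective; the exact
# loss used by JOINTCOM is not specified in the abstract.
import torch
import torch.nn.functional as F


def retriever_alignment_loss(retrieval_scores, generation_losses, temperature=1.0):
    """retrieval_scores: (batch, k) retriever similarity scores.
    generation_losses: (batch, k) generator loss when conditioning on each
    exemplar; lower loss means a more useful exemplar."""
    # Target distribution: exemplars the generator found useful get high mass.
    target = F.softmax(-generation_losses.detach() / temperature, dim=-1)
    log_pred = F.log_softmax(retrieval_scores, dim=-1)
    return F.kl_div(log_pred, target, reduction="batchmean")


if __name__ == "__main__":
    scores = torch.randn(4, 5, requires_grad=True)  # retriever scores, k=5
    losses = torch.rand(4, 5)                       # per-exemplar generation loss
    loss = retriever_alignment_loss(scores, losses)
    loss.backward()  # gradient flows only into the retriever's scores
    print(float(loss))
```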