Automatic code summarization refers to generating concise natural language descriptions for code snippets. It is vital for improving the efficiency of program understanding among software developers and maintainers. Despite the impressive strides made by deep learning-based methods, limitations remain in their ability to understand and model semantic information due to the unique nature of programming languages. We propose two methods to boost code summarization models: context-based abbreviation expansion and unigram language model-based subword segmentation. We use heuristics to expand abbreviations within identifiers, reducing semantic ambiguity and improving the language alignment of code summarization models. Furthermore, we leverage subword segmentation to tokenize code into finer subword sequences, providing more semantic information during training and inference and thereby enhancing program understanding. These methods are model-agnostic and can be readily integrated into existing automatic code summarization approaches. Experiments conducted on two widely used Java code summarization datasets demonstrate the effectiveness of our approach. Specifically, by fusing original and modified code representations into the Transformer model, our Semantic Enhanced Transformer for Code Summarization (SETCS) serves as a robust semantic-level baseline. By simply modifying the datasets, our methods achieved performance improvements of up to 7.3%, 10.0%, 6.7%, and 3.2% for representative code summarization models in terms of BLEU-4, METEOR, ROUGE-L, and SIDE, respectively.
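As a rough illustration of the identifier abbreviation expansion described in this abstract, the sketch below splits camelCase/snake_case identifiers and expands abbreviations via a lookup table. The table, function names, and splitting rules are hypothetical simplifications, not the paper's actual heuristics.

```python
import re

# Illustrative abbreviation table (hypothetical; not from the paper).
ABBREVIATIONS = {
    "cnt": "count", "idx": "index", "msg": "message",
    "num": "number", "str": "string", "val": "value",
}

def split_identifier(name: str) -> list[str]:
    """Split a camelCase or snake_case identifier into lowercase parts."""
    parts = re.split(r"_|(?<=[a-z0-9])(?=[A-Z])", name)
    return [p.lower() for p in parts if p]

def expand_identifier(name: str) -> str:
    """Rewrite an identifier with known abbreviations expanded."""
    return " ".join(ABBREVIATIONS.get(p, p) for p in split_identifier(name))

print(expand_identifier("msgCnt"))    # message count
print(expand_identifier("user_idx"))  # user index
```

The paper's context-based approach is presumably more sophisticated than a static table; this only shows why expansion helps, since "message count" aligns far better with natural language summaries than "msgCnt".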
ISBN:
(print) 9781665452786
Accurate and up-to-date software documentation is an important factor in the maintenance and evolution of software systems. Especially with legacy software, documentation is often outdated or missing entirely, and manual redocumentation is not feasible. In recent years, automatic code summaries based on artificial neural network (ANN) models have been proposed to address this problem, and metric-based evaluations suggest promising quality of the generated summaries. To evaluate the applicability of state-of-the-art code summarization in an industry context, we conduct an expert evaluation to assess the quality of the generated summaries for JPA program comprehension. We then compare the level of quality perceived by human experts for both predicted and reference summaries, and we discuss how these results are influenced by industry-specific requirements and how they correlate with automatically computed source code summary metrics. The results show that the quality of predicted summaries is predominantly (about 80%) poor in terms of accuracy and completeness. Moreover, the results support the growing consensus that the widely used BLEU and ROUGE-L scores are not a suitable means of evaluating the quality of code summarization. While these metrics are an adequate means of comparison with existing related work, they cannot reflect the human-perceived level of quality in practice.
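To make concrete what the ROUGE-L metric criticized in this abstract actually measures, the sketch below computes it as the F1 score over the longest common subsequence (LCS) of candidate and reference tokens. Tokenization by whitespace and equal precision/recall weighting are simplifying assumptions.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    return 2 * prec * rec / (prec + rec)

print(round(rouge_l("returns the item count", "returns the number of items"), 3))  # 0.444
```

The example pair shares only the subsequence "returns the", so the score is low even though the two summaries mean nearly the same thing, which illustrates the abstract's point that token-overlap metrics can diverge from human-perceived quality.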
ISBN:
(print) 9781665458139
Automatic code summarization is an important topic in the software engineering field, which aims to automatically generate descriptions for source code. Most existing methods apply Graph Neural Networks (GNNs) to the Abstract Syntax Tree (AST) to achieve code summarization. However, these methods face two major challenges: 1) they can only capture limited structural information of the source code; 2) they do not effectively mitigate the Out-Of-Vocabulary (OOV) problem by reducing vocabulary size. To resolve these problems, in this paper we propose a novel code summarization model named Dynamic Graph attention-based Transformer (DG-Trans for short), which effectively captures the abundant information in the code subword sequence and fuses a dynamic graph attention mechanism with the Transformer. Extensive experiments show that DG-Trans outperforms state-of-the-art models (such as Ast-Attendgru, Transformer, and codeGNN), improving BLEU and ROUGE-L scores by 8.39% and 8.86% on average, respectively.
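A minimal sketch of the kind of subword splitting that shrinks a code vocabulary and mitigates OOV, in the spirit of the subword sequences DG-Trans consumes. The splitting rules below (underscores, digits, camelCase humps, acronym runs) are assumptions for illustration, not the paper's exact preprocessing.

```python
import re

def to_subwords(token: str) -> list[str]:
    """Split a code token into lowercase subwords.

    Handles snake_case, camelCase, acronym runs (HTTP), and digits,
    so rare compound identifiers decompose into common vocabulary items.
    """
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+", token)
    return [p.lower() for p in parts]

print(to_subwords("parseHTTPResponse2"))  # ['parse', 'http', 'response', '2']
```

An unseen identifier like `parseHTTPResponse2` would be OOV as a whole token, but each of its subwords is likely already in a modest vocabulary, which is the effect the abstract's second challenge targets.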