Code review is a critical process in software development that improves product quality by identifying errors early. A key aspect of this process is selecting appropriate reviewers to scrutinize changes made to source code. In large-scale open-source projects, however, selecting the most suitable reviewers for a specific change can be challenging. To address this, we introduce Code Context Based Reviewer Recommendation (CCB-RR), a model that leverages information from changesets to recommend the most suitable reviewers. The model considers the paths of modified files and the context derived from the changesets, including their titles and descriptions. Additionally, CCB-RR employs KeyBERT to extract the most relevant keywords and to compare the semantic similarity across changesets. The model integrates the paths of modified files, keyword information, and the context of code changes to form a comprehensive picture of the changeset. We conducted extensive experiments on four open-source projects to demonstrate the effectiveness of CCB-RR. The model achieved a Top-1 accuracy of 60%, 55%, 51%, and 45% on the Android, OpenStack, QT, and LibreOffice projects, respectively. For Mean Reciprocal Rank (MRR), CCB-RR achieved 71%, 62%, 52%, and 68% on the same projects, respectively, highlighting its potential for practical application in code reviewer recommendation.
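The two metrics reported in this abstract, Top-1 accuracy and Mean Reciprocal Rank, can be illustrated with a short sketch. It uses the standard definitions of both metrics and is not taken from the CCB-RR implementation; the function names, the reviewer names, and the toy rankings are all hypothetical.

```python
from typing import List, Sequence, Set


def top_k_accuracy(rankings: Sequence[Sequence[str]],
                   actual: Sequence[Set[str]], k: int = 1) -> float:
    """Fraction of changesets with a true reviewer among the first k recommendations."""
    if not rankings:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(rankings, actual)
        if any(reviewer in truth for reviewer in ranked[:k])
    )
    return hits / len(rankings)


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]],
                         actual: Sequence[Set[str]]) -> float:
    """Average of 1/rank of the first correct reviewer (0 if none is recommended)."""
    if not rankings:
        return 0.0
    total = 0.0
    for ranked, truth in zip(rankings, actual):
        for position, reviewer in enumerate(ranked, start=1):
            if reviewer in truth:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical data: three changesets, each actually reviewed by "alice".
example_rankings: List[List[str]] = [
    ["alice", "bob", "carol"],   # correct reviewer at rank 1
    ["bob", "alice", "carol"],   # correct reviewer at rank 2
    ["carol", "bob", "alice"],   # correct reviewer at rank 3
]
example_actual: List[Set[str]] = [{"alice"}, {"alice"}, {"alice"}]

print(top_k_accuracy(example_rankings, example_actual, k=1))
print(mean_reciprocal_rank(example_rankings, example_actual))
```

In this toy case only the first changeset counts toward Top-1 accuracy, while MRR also gives partial credit to the second and third, which is why a model's MRR is never lower than its Top-1 accuracy.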
ISBN: (print) 9798350376975; 9798350376968
Existing code summarization approaches overlook the distinct context focuses of developers when generating code comments. This paper proposes a context-focused code summarization approach based on prompt tuning. It enables pre-trained code models to identify specific context focuses around a method and to generate the method's comment with the corresponding contextual information, which improves the accuracy and informativeness of the generated comments. As a first attempt, we design prompt templates for six common types of contexts, construct a context-focused code-comment dataset, and prompt-tune two pre-trained code models with the dataset to generate code comments. The experimental results demonstrate that our approach significantly improves the ability of existing models to generate context-focused comments. Compared with existing approaches, our generated comments are more informative, and our models can adapt to different code contexts, making the generation process more interpretable. We discuss the envisioned application of our approach and challenges for future work, including identifying more essential code contexts automatically and constructing more effective prompts.
Purpose: This paper aims to use machine learning to enable people and machines to interact with greater certainty, extending and expanding human expertise and cognition. Design/methodology/approach: Intelligent code reuse recommendation based on the analysis, mining and learning of code big data can effectively improve the efficiency and quality of software reuse, covering both common code units in a specific field and common code units that are not related to the field. Findings: Focusing on context-based intelligent code reuse recommendation, this paper reviews research work in two areas, mainly in practical applications of smart decision support and cognitive adaptive systems: code reuse recommendation based on template mining and code reuse recommendation based on deep learning. Originality/value: On this basis, the paper outlines future development directions for context-based intelligent code reuse recommendation.
ISBN: (print) 9798350302141
Template-based automatic program repair (APR), which uses pre-defined fix patterns to generate patches, is common in the APR literature and is implemented in many APR tools. Because existing APR tools can fix only a limited number of bugs, automatically mining fix patterns from human-written fixes is an appropriate way to expand the fix pattern set at low cost. On the other hand, a large number of fix patterns does not ensure that a template-based APR tool becomes more effective if the buggy context, which describes the condition for choosing a fix pattern to generate fixes for buggy code, is too generic. Because state-of-the-art fix patterns provide only a generic code context for matching a fix pattern, the number of matched patterns, and therefore the number of generated patches, is exceptionally large. We therefore mine human-written fixes automatically, not only to extract new fix patterns but also to clarify the buggy contexts of the bug fix instances forming each fix pattern. Unlike previous work, we propose to use test execution information, such as exception types and execution paths, in addition to code contexts, as the buggy contexts of the fix patterns. As the mining strategy, we choose an agglomerative hierarchical clustering algorithm with a custom distance metric to group bug fix instances that make the same or similar changes into the same cluster. The evaluation on the bug sets of two APR benchmarks, Defects4J and Bears, reveals that the fix patterns extracted from clusters containing at least two similar bug fix instances match most of the state-of-the-art fix patterns. Moreover, 11 new fix patterns are found. The buggy contexts, including the code context and the test execution information, of all bug fix instances forming each fix pattern are also clarified. Our fix patterns with the new buggy context information are expected to help template-based APR tools generate and verify patches more effectively.
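The mining strategy described in this abstract, agglomerative hierarchical clustering under a custom distance, can be sketched in a few lines. The abstract does not specify the authors' distance metric or linkage rule, so this sketch assumes Jaccard distance over sets of edit-action and context labels, average linkage, and a distance threshold as the stopping rule; all labels, names, and the threshold value are hypothetical.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence

FixInstance = FrozenSet[str]


def jaccard_distance(a: FixInstance, b: FixInstance) -> float:
    """Assumed stand-in for the custom metric: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerative_clusters(items: Sequence[FixInstance],
                           distance: Callable[[FixInstance, FixInstance], float],
                           threshold: float) -> List[List[int]]:
    """Merge the closest pair of clusters (average linkage) until the
    closest pair is farther apart than `threshold`. Returns index lists."""
    clusters: List[List[int]] = [[i] for i in range(len(items))]

    def linkage(c1: List[int], c2: List[int]) -> float:
        total = sum(distance(items[i], items[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = combinations(range(len(clusters)), 2)
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[a], clusters[b]) > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical bug fix instances: edit actions plus buggy-context labels.
fixes: List[FixInstance] = [
    frozenset({"insert:null_check", "ctx:NullPointerException"}),
    frozenset({"insert:null_check", "ctx:NullPointerException", "ctx:if"}),
    frozenset({"replace:operator", "ctx:AssertionError"}),
]

clusters = agglomerative_clusters(fixes, jaccard_distance, threshold=0.5)
# Keep only clusters with at least two instances as candidate fix patterns.
candidate_patterns = [c for c in clusters if len(c) >= 2]
print(clusters, candidate_patterns)
```

Including the exception type among the labels mirrors the idea of treating test execution information as part of the buggy context: two fixes that make the same edit but fail with different exceptions end up farther apart.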
ISBN: (print) 9781728105918
Links between issue reports and the code commits that fix them can greatly reduce the maintenance costs of a software project. More often than not, however, these links are missing and thus cannot be fully utilized by developers. Current approaches to issue-commit link recovery extract text features and code features, in terms of textual similarity, from issue reports and commit logs to train their models. These approaches are limited because semantic information can be lost. Furthermore, few of them consider the effect of the source code files related to a commit on issue-commit link recovery, let alone the semantics of the code context. To tackle these problems, we propose to construct a code knowledge graph of a code repository and to generate embeddings of source code files that capture the semantics of the code context. We also use embeddings to capture the semantics of issue- or commit-related text. We then use these embeddings to calculate semantic similarity and code similarity with a deep learning approach before training an SVM binary classification model with additional features. Evaluations on real-world projects show that our approach, DeepLink, can outperform the state-of-the-art method.
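The step of turning embeddings into similarity features for a binary classifier can be illustrated as follows. This is a simplified sketch, not DeepLink itself: the abstract says the similarities are computed with a deep learning approach, whereas this example substitutes plain cosine similarity, and the function names, the two-feature layout, and the toy vectors are all assumptions.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_features(issue_text_vec: Sequence[float],
                  commit_text_vec: Sequence[float],
                  issue_code_vec: Sequence[float],
                  commit_file_vecs: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical feature vector for one (issue, commit) candidate pair:
    [text similarity, best code similarity over the commit's files]."""
    text_sim = cosine_similarity(issue_text_vec, commit_text_vec)
    code_sim = max(
        (cosine_similarity(issue_code_vec, f) for f in commit_file_vecs),
        default=0.0,  # a commit with no embedded files contributes no code signal
    )
    return [text_sim, code_sim]


# Toy 2-dimensional embeddings (real embeddings would be much larger).
features = link_features(
    issue_text_vec=[1.0, 0.0],
    commit_text_vec=[1.0, 1.0],
    issue_code_vec=[0.0, 1.0],
    commit_file_vecs=[[1.0, 0.0], [0.0, 2.0]],
)
print(features)
```

Feature vectors of this kind, one per candidate issue-commit pair and labeled as linked or not linked, are what a binary classifier such as an SVM would be trained on.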