Code-based adversarial attacks play a crucial role in revealing vulnerabilities of software systems. Recently, pretrained programming language models (PLMs) have demonstrated remarkable success in various significant software engineering tasks, progressively transforming the paradigm of software development. Despite their impressive capabilities, these powerful models are vulnerable to adversarial attacks. Therefore, it is necessary to carefully investigate the robustness and vulnerabilities of PLMs by means of adversarial attacks. Adversarial attacks entail imperceptible input modifications that cause target models to make incorrect predictions. Existing approaches for attacking PLMs often employ either identifier renaming or a greedy algorithm, which may yield sub-optimal performance or lead to high inference times. In response to these limitations, we propose CARL, an unsupervised black-box attack model that leverages reinforcement learning to generate imperceptible adversarial examples. Specifically, CARL comprises a programming language encoder and a perturbation prediction layer. To achieve a more effective and efficient attack, we cast the task as a sequential decision-making process, optimized through policy gradient with a suite of reward functions. We conduct extensive experiments to validate the effectiveness of CARL on code summarization, code translation, and code refinement tasks, covering various programming languages and PLMs. The experimental results demonstrate that CARL surpasses state-of-the-art code attack models, achieving the highest attack success rate across multiple tasks and PLMs while maintaining high attack efficiency, imperceptibility, consistency, and fluency.
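The abstract frames adversarial example generation as a sequential decision-making process optimized by policy gradient. The sketch below illustrates that general idea with a REINFORCE-style update over token positions; it assumes a generic frozen code encoder and a hypothetical black-box attack_reward function, and is an illustration of the technique rather than the authors' CARL implementation.

import torch
import torch.nn as nn

class PerturbationPolicy(nn.Module):
    """Scores each token position; a high score means 'perturb this token'."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, token_embeddings: torch.Tensor) -> torch.distributions.Categorical:
        # token_embeddings: (seq_len, hidden_dim) from a frozen code encoder
        logits = self.scorer(token_embeddings).squeeze(-1)  # (seq_len,)
        return torch.distributions.Categorical(logits=logits)

def reinforce_step(policy, optimizer, token_embeddings, attack_reward):
    """One REINFORCE update: sample a position to perturb, query a black-box
    reward (e.g. drop in the victim PLM's output quality), and push the
    policy toward high-reward perturbations."""
    dist = policy(token_embeddings)
    position = dist.sample()                  # which token to rename/perturb
    reward = attack_reward(position.item())   # hypothetical black-box query
    loss = -dist.log_prob(position) * reward  # policy-gradient objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return position.item(), reward

In a full attack, the reward would combine attack success with imperceptibility and fluency terms, in line with the "suite of reward functions" mentioned in the abstract.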
ISBN: (Print) 9798350366266; 9798350366259
Design patterns (DPs) facilitate effective software architecture and design and must be maintained and enforced in existing complex software products, for example, automotive software. Implementing DPs in source code facilitates the development of high-quality software products with less effort. However, recognizing DPs in program code is challenging, and this makes it difficult to keep architectural evolution under control in large software products over time. As DPs are abstract solutions, the programs used to recognize them in source code have significant limitations. In this paper, we employ four programming language models based on Bidirectional Encoder Representations from Transformers (BERT) to study to what extent these models can recognize an exemplar DP, in this case, Singleton. We compare four language representation models (OpenAI CodeX, Facebook AI TransCoder, ACoRA/BERT, and CCFlex/bag-of-words) and relate the models' rankings to a simple baseline metric. We found a discrepancy between the models in identifying Singletons and observed that the models are inconsistently sensitive to name and semantic changes. Specifically, CodeX recognizes the existence of Singletons better than the other models, while only ACoRA shows some signs of recognizing DP semantics.
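To make concrete how such a study can probe whether a language model "sees" a pattern, the sketch below ranks candidate classes by embedding similarity to a Singleton exemplar. It uses mean-pooled CodeBERT embeddings purely as a stand-in encoder; this is an assumed setup for illustration, not one of the four models compared in the paper.

import torch
from transformers import AutoTokenizer, AutoModel

# Stand-in encoder; the study itself compares CodeX, TransCoder, ACoRA/BERT, CCFlex.
tok = AutoTokenizer.from_pretrained("microsoft/codebert-base")
enc = AutoModel.from_pretrained("microsoft/codebert-base")

def embed(code: str) -> torch.Tensor:
    """Mean-pooled last-hidden-state embedding of a code snippet."""
    inputs = tok(code, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = enc(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

SINGLETON_EXEMPLAR = """
class Config {
    private static Config instance;
    private Config() {}
    public static Config getInstance() {
        if (instance == null) instance = new Config();
        return instance;
    }
}
"""

def singleton_score(candidate: str) -> float:
    """Cosine similarity to the exemplar; used to rank candidate classes."""
    a, b = embed(SINGLETON_EXEMPLAR), embed(candidate)
    return torch.nn.functional.cosine_similarity(a, b, dim=0).item()

Renaming identifiers or altering the lazy-initialization logic in the candidate and observing how the score shifts is one simple way to test sensitivity to name versus semantic changes, as the study does.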
ISBN: (Print) 9798400703751
Design patterns (DPs) provide reusable and general solutions for frequently encountered problems. Patterns are important to maintain the structure and quality of software products, in particular in large and distributed systems like automotive software. Modern language models (like Code2Vec or Word2Vec) indicate a deep understanding of programs, which has been shown to help in tasks such as program repair or program comprehension, and therefore show promise for design pattern recognition (DPR) in industrial contexts. The models are trained in a self-supervised manner, using a large unlabelled code base, which allows them to quantify such abstract concepts as programming styles, coding guidelines, and, to some extent, the semantics of programs. This study demonstrates how two language models, Code2Vec and Word2Vec, trained on two public automotive repositories, can separate programs containing specific DPs. The results show that Code2Vec and Word2Vec produce average F1-scores of 0.781 and 0.690, respectively, on open-source Java programs, showing promise for DPR in practice.
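As a concrete illustration of the Word2Vec side of such a setup, the sketch below trains token embeddings on a code corpus, averages them into per-program vectors, and fits a classifier to separate pattern from non-pattern programs. The tokenizer, labels, and evaluation are simplified placeholders, not the paper's automotive data or protocol.

import re
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def tokenize(source: str) -> list[str]:
    """Naive code tokenizer: identifiers/keywords plus punctuation symbols."""
    return re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_]", source)

def program_vector(model: Word2Vec, tokens: list[str]) -> np.ndarray:
    """Average the embeddings of tokens seen during training."""
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

def evaluate_separation(programs: list[str], labels: list[int]) -> float:
    """programs: source strings; labels: 1 = contains the DP, 0 = does not."""
    corpus = [tokenize(p) for p in programs]
    w2v = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1)
    X = np.stack([program_vector(w2v, toks) for toks in corpus])
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    # Training-set F1 for illustration; a real study would use held-out data.
    return f1_score(labels, clf.predict(X))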
Software Engineering (SE) pre-trained language models (PLMs), such as CodeBERT, are pre-trained on large code corpora, and their learned knowledge has shown success in transferring into downstream tasks (e.g., code clone detection) through fine-tuning of the PLMs. In Natural Language Processing (NLP), an alternative way of transferring the knowledge of PLMs is explored through the use of adapters, compact and parameter-efficient modules that are inserted into a PLM. Although the use of adapters has shown promising results in many NLP-based downstream tasks, their application and exploration in SE-based downstream tasks are limited. Here, we study knowledge transfer using adapters on multiple downstream tasks, including cloze test, code clone detection, and code summarization. These adapters are trained on code corpora and are inserted into a PLM that is pre-trained on English corpora or code corpora; we call these PLMs NL-PLM and C-PLM, respectively. We observed an improvement in results using an NL-PLM over a PLM that does not have adapters, suggesting that adapters can transfer and utilize useful knowledge from an NL-PLM to SE tasks. The results are sometimes on par with, or exceed, those of a C-PLM, while being more efficient in terms of the number of parameters and training time. Interestingly, adapters inserted into a C-PLM generally yield better results than a traditionally fine-tuned C-PLM. Our results open new directions to build more compact models for SE tasks.
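The adapters referred to here are small bottleneck modules inserted inside each transformer layer; only the adapter weights are trained while the PLM stays frozen, which is where the parameter efficiency comes from. A minimal PyTorch sketch of that idea, not tied to any specific PLM or to the paper's exact configuration, looks like this:

import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.
    Only these weights are fine-tuned for the downstream task."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the frozen PLM's representation.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

def freeze_plm_train_adapters(plm: nn.Module, adapters: nn.ModuleList):
    """Parameter-efficient setup: freeze the PLM, optimize only the adapters."""
    for p in plm.parameters():
        p.requires_grad = False
    return torch.optim.AdamW(adapters.parameters(), lr=1e-4)

With a typical bottleneck of 64 units per layer, the trainable parameter count is a small fraction of the full PLM, which is consistent with the efficiency claims in the abstract.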
We have developed novel techniques for component-based specification of programming languages. In our approach, the semantics of each fundamental programming construct is specified independently, using an inherently modular framework such that no reformulation is needed when constructs are combined. A language specification consists of an unrestricted context-free grammar for the syntax of programs, together with an analysis of each language construct in terms of fundamental constructs. An open-ended collection of fundamental constructs is currently being developed. When supported by appropriate tools, our techniques allow a more agile approach to the design, modelling, and implementation of programming and domain-specific languages. In particular, our approach encourages language designers to proceed incrementally, using prototype implementations generated from specifications to test tentative designs. The components of our specifications are independent and highly reusable, so initial language specifications can be rapidly produced, and can easily evolve in response to changing design decisions. In this paper, we outline our approach, and relate it to the practices and principles of agile modelling.
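To make the "analysis of each language construct in terms of fundamental constructs" more concrete, the toy sketch below (an illustration only, not the authors' specification framework or notation) expresses a C-style for-loop by translation into reusable fundamental constructs whose semantics would be specified once and reused unchanged across language specifications.

from dataclasses import dataclass

# Fundamental constructs: each has its semantics specified once, independently,
# and is reused unchanged when new languages or constructs are specified.
@dataclass
class Seq:
    first: object
    second: object

@dataclass
class WhileTrue:
    cond: object
    body: object

def analyse_for(init, cond, step, body):
    """for (init; cond; step) body  ~>  seq(init, while-true(cond, seq(body, step)))

    The source-language construct gets no semantics of its own; it is
    analysed in terms of the fundamental constructs above."""
    return Seq(init, WhileTrue(cond, Seq(body, step)))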