Search Results - Inner Mongolia University Library

E-PRedictor: an approach for early prediction of pull request acceptance

Science China (Information Sciences), 2025, Vol. 68, No. 5, pp. 380-395

Authors: Kexing CHEN, Lingfeng BAO, Xing HU, Xin XIA, Xiaohu YANG. Affiliations: State Key Laboratory of Blockchain and Data Security, Zhejiang University; Software Engineering Application Technology Lab

A pull request (PR) is an event in Git where a contributor asks project maintainers to review code he/she wants to merge into a project. The PR mechanism greatly improves the efficiency of distributed software development in the open-source community. Nevertheless, the massive number of PRs in an open-source software (OSS) project increases the workload of developers. To reduce the burden on developers, many previous studies have investigated factors that affect the chance of PRs getting accepted and built prediction models based on these factors. However, most prediction models are built on data that become available only some time after PRs are submitted (e.g., comments on PRs), which makes them of little use in practice because integrators still need to spend a large amount of effort on inspecting PRs. In this study, we propose an approach named E-PRedictor (earlier PR predictor) to predict whether a PR will be merged when it is created. E-PRedictor combines three dimensions of manual statistic features (i.e., contributor profile, specific pull request, and project profile) and deep semantic features generated by BERT models based on the description and code changes of PRs. To evaluate the performance of E-PRedictor, we collect 475192 PRs from 49 popular open-source projects on GitHub. The experiment results show that our proposed approach can effectively predict whether a PR will be merged or not. E-PRedictor outperforms the baseline models (e.g., Random Forest and VDCNN) built on manual features significantly. In terms of F1@Merge, F1@Reject, and AUC (area under the receiver operating characteristic curve), the performance of E-PRedictor is 90.1%, 60.5%, and 85.4%, respectively.
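The abstract does not define F1@Merge and F1@Reject. A minimal sketch, assuming they denote the F1 score computed with "merge" and "reject" respectively as the positive class; the function name `f1_for_class` and the toy labels are hypothetical and not taken from the paper:

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], target: str) -> float:
    """F1 score when `target` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): actual outcome vs. predicted outcome for six PRs.
y_true = ["merge", "merge", "merge", "reject", "reject", "merge"]
y_pred = ["merge", "merge", "reject", "reject", "merge", "merge"]

f1_merge = f1_for_class(y_true, y_pred, "merge")
f1_reject = f1_for_class(y_true, y_pred, "reject")
print(f1_merge, f1_reject)
```

Reporting the two scores separately is useful when the classes are imbalanced, since a high score on the majority class can hide weaker performance on the minority class.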

Keywords: pull request; prediction model; GitHub

Ratchet: Retrieval Augmented Transformer for Program Repair

35th IEEE International Symposium on Software Reliability Engineering, ISSRE 2024

Authors: Wang, Jian; Liu, Shangqing; Xie, Xiaofei; Siow, Jingkai; Liu, Kui; Li, Yi. Affiliations: Singapore Management University, Singapore; Nanyang Technological University, Singapore; Software Engineering Application Technology Laboratory, Huawei, China

ISBN: (print) 9798350353884

Automated Program Repair (APR) presents the promising momentum of releasing developers from the burden of manual debugging tasks by automatically fixing bugs in various ways. Recent advances in deep learning have inspired many works that employ deep learning techniques to fix buggy programs. However, several challenges remain unaddressed: (1) state-of-the-art fault localization techniques often require additional artifacts, such as bug-triggering test cases or bug reports. These artifacts are not always available in the early development phases; (2) Sequence-to-Sequence model-based APR often requires additional contexts with high quality to generate patches. Yet, it is challenging to identify high-quality contexts that are not common in *** this paper, with the redundancy assumption in program repair, we propose a dual deep learning-based APR tool, Ratchet, for localizing (Ratchet-FL) and repairing (Ratchet-PG) buggy programs. Ratchet-FL localizes buggy statements based on the features learned by a simple BiLSTM model from the code, without any bug-triggering test cases or bug reports. Ratchet-PG relies on our proposed retrieval augmented transformer to learn the historical patches and generate patches for fixing bugs. We evaluate the effectiveness of Ratchet with the in-the-lab DrRepair dataset and the in-the-wild dataset Ratchet-DS (curated in this work). Our experimental results show that Ratchet outperforms state-of-the-art deep learning approaches on fault localization with 39.8-96.4% accuracy and patch generation with 18.4-46.4% repair accuracy. © 2024 IEEE.

Keywords: Program debugging

Dual Prompt-Based Few-Shot Learning for Automated Vulnerability Patch Localization

31st IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024

Authors: Zhang, Junwei; Hu, Xing; Bao, Lingfeng; Xia, Xin; Li, Shanping. Affiliations: Zhejiang University, The State Key Laboratory of Blockchain and Data Security, Hangzhou, China; Software Engineering Application Technology Lab, Huawei, China

ISBN: (print) 9798350330663

Vulnerabilities are disclosed with corresponding patches so that users can remediate them in time. However, there are instances where patches are not released with the disclosed vulnerabilities, causing hidden dangers, especially if dependent software remains uninformed about the affected code repository. Hence, it is crucial to automatically locate security patches for disclosed vulnerabilities among a multitude of commits. Despite the promising performance of existing learning-based localization approaches, they still suffer from the following limitations: (1) They cannot perform well in data scarcity scenarios. Most neural models require extensive datasets to capture the semantic correlations between the vulnerability description and code commits, while the number of disclosed vulnerabilities with patches is limited. (2) They struggle to capture the deep semantic correlations between the vulnerability description and code commits due to inherent differences in semantics and characters between code changes and commit messages. It is difficult to use one model to capture the semantic correlations between vulnerability descriptions and code commits. To mitigate these two limitations, in this paper, we propose a novel security patch localization approach named PromVPat, which utilizes the dual prompt tuning channel to capture the semantic correlation between vulnerability descriptions and commits, especially in data scarcity (i.e., few-shot) scenarios. We first input the commit message and code changes with the vulnerability description into the prompt generator to generate two new inputs with prompt templates. Then, we adopt a pre-trained language model (i.e., PLM) as the encoder, utilize the prompt tuning method to fine-tune the encoder, and generate two correlation probabilities as the semantic features. In addition, we extract 26 handcrafted features from the vulnerability descriptions and the code commits. Finally, we utilize the attention mechanism to fuse the

Keywords: Semantics

Investigating White-Box Attacks for On-Device Models

46th IEEE/ACM International Conference on Software Engineering, ICSE 2024

Authors: Zhou, Mingyi; Gao, Xiang; Wu, Jing; Liu, Kui; Sun, Hailong; Li, Li. Affiliations: Monash University, Melbourne, VIC, Australia; Beihang University, Beijing, China; Huawei Software Engineering Application Technology Lab, China; Beihang University, Beijing, China; Yunnan Key Laboratory of Software Engineering, China

ISBN: (print) 9798400702174

Numerous mobile apps have leveraged deep learning capabilities. However, on-device models are vulnerable to attacks as they can be easily extracted from their corresponding mobile apps. Although the structure and parameters information of these models can be accessed, existing on-device attacking approaches only generate black-box attacks (i.e., indirect white-box attacks), which are less effective and efficient than white-box strategies. This is because mobile deep learning (DL) frameworks like TensorFlow Lite (TFLite) do not support gradient computing (referred to as non-debuggable models), which is necessary for white-box attacking algorithms. Thus, we argue that existing findings may underestimate the harmfulness of on-device attacks. To validate this, we systematically analyze the difficulties of transforming the on-device model to its debuggable version and propose a Reverse engineering framework for On-device Models (REOM), which automatically reverses the compiled on-device TFLite model to its debuggable version, enabling attackers to launch white-box attacks. Our empirical results show that our approach is effective in achieving automated transformation (i.e., 92.6%) among 244 TFLite models. Compared with previous attacks using surrogate models, REOM enables attackers to achieve higher attack success rates (10.23%→89.03%) with a hundred times smaller attack perturbations (1.0→0.01). Our findings emphasize the need for developers to carefully consider their model deployment strategies, and use white-box methods to evaluate the vulnerability of on-device models. Our artifacts (https://***/zhoumingyi/REOM) are available. © 2024 ACM.

Keywords: Deep learning

GIRP: Energy-Efficient QoS-Oriented Microservice Resource Provisioning via Multi-Objective Multi-Task Reinforcement Learning

IEEE Transactions on Mobile Computing, 2025, Vol. 24, No. 7, pp. 5793-5807

Authors: Yuan, Honggang; Wang, Ting; Fu, Min; Shi, Yuanming. Affiliations: East China Normal University, MoE Engineering Research Center of Software/Hardware Co-design Technology and Application, Shanghai Key Laboratory of Trustworthy Computing, Shanghai 200062, China; National University of Singapore, Department of Electrical and Computer Engineering, 117583, Singapore; ShanghaiTech University, School of Information Science and Technology, Shanghai 201210, China

Microservice architecture has revolutionized web service development by facilitating loosely coupled and independently developable components distributed as containers or virtual machines. While existing studies emphasize end-to-end latency, this paper investigates energy-efficient quality-of-service (QoS)-oriented microservice provisioning, focusing on both QoS satisfaction and power consumption (PC) conservation. We propose the Green and Intelligent Resource Provision (GIRP) architecture, integrating a data-driven energy-latency-aware resource allocation and scheduling manager to balance latency and PC. To reconcile the trade-offs involved, a dual-objective optimization problem is formulated to minimize latency and energy use by selecting proper servers, allocating CPU cores, and determining service replicas. To address challenges with discrete variables, dual objectives, and implicit mappings, we leverage a model-free deep deterministic policy gradient-based reinforcement learning algorithm. Specifically, we develop a multi-task agent via the Multi-gate Mixture-of-Experts model to simultaneously make two separate actions regarding CPU core numbers and service replica numbers, followed by a single-task agent to determine service scheduling. Extensive experiments on the DeathStarBenchmark testbed validate GIRP's effectiveness, demonstrating approximately 52% resource savings and a 43% reduction in PC compared to leading methods like Sinan, Firm, and heuristic-based algorithms. These results highlight GIRP's capability to optimize microservice orchestration by balancing end-to-end latency and power efficiency. © 2025 IEEE. All rights reserved.

Keywords: Quality of service

An Efficient and Privacy-Preserving Spatial Crowdsourcing Protocol from Hash Functions for IoT

IEEE Internet of Things Journal, 2025, Vol. 12, No. 12, pp. 19231-19243

Authors: Abla, Parhat; Fang, Wan; Li, Taotao; Deng, Zhihong; Xie, Anke. Affiliations: South China Normal University, School of Software, Foshan 528225, China; Sun Yat-Sen University, School of Software Engineering, Zhuhai 528478, China; Guangzhou University, School of Mathematics and Information Science, Guangzhou 510006, China; Yunnan Innovation Institute of Beihang University, Yunnan Key Laboratory of Blockchain Application Technology, Kunming 650233, China

The proliferation of smart devices has propelled the advancement of IoT-based spatial crowdsourcing. The issue of location privacy in task allocation for IoT-based spatial crowdsourcing has attracted significant attention. Therefore, the main goals of privacy-preserving spatial crowdsourcing are i) achieving better location privacy for both participants and tasks; ii) achieving better allocation performance, i.e., accuracy and average moving distance. The homomorphic encryption-based approaches can achieve these goals, yet they suffer from heavy computation and large communication overhead. Although the differential privacy-based approaches are very efficient, these approaches sacrifice allocation performance to achieve better location privacy. Motivated by the deficiencies of these existing approaches, we propose a lightweight hash-based spatial crowdsourcing protocol, which not only protects both task location and participant location from the server but also reduces service providers' computation and communication overhead. Besides, our design is independent of the concrete hash function and thus can be instantiated by any collision-resistant cryptographic hash function. Experiment results demonstrate that our protocol outperforms related works in terms of accuracy and average moving distance. © 2014 IEEE.

Keywords: Differential privacy

Practical Program Repair via Preference-based Ensemble Strategy

46th IEEE/ACM International Conference on Software Engineering, ICSE 2024

Authors: Zhong, Wenkang; Xu, Tongtong; Li, Chuanyi; Ge, Jidong; Liu, Kui; Bissyandé, Tegawendé F.; Luo, Bin; Ng, Vincent. Affiliations: National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; Huawei Software Engineering Application Technology Lab, Hangzhou, China; University of Luxembourg, Luxembourg; Human Language Technology Research Institute, University of Texas at Dallas, Richardson, TX, United States

ISBN: (print) 9798400702174

To date, over 40 Automated Program Repair (APR) tools have been designed with varying bug-fixing strategies, which have been demonstrated to have complementary performance in terms of being effective for different bug classes. Intuitively, it should be feasible to improve the overall bug-fixing performance of APR via assembling existing tools. Unfortunately, simply invoking all available APR tools for a given bug can result in unacceptable costs on APR execution as well as on patch validation (via expensive testing). Therefore, while assembling existing tools is appealing, it requires an efficient strategy to reconcile the need to fix more bugs and the requirements for practicality. In light of this problem, we propose a Preference-based Ensemble Program Repair framework (P-EPR), which seeks to effectively rank APR tools for repairing different bugs. P-EPR is the first non-learning-based APR ensemble method that is novel in its exploitation of repair patterns as a major source of knowledge for ranking APR tools and its reliance on a dynamic update strategy that enables it to immediately exploit and benefit from newly derived repair results. Experimental results show that P-EPR outperforms existing strategies significantly both in flexibility and effectiveness. © 2024 IEEE Computer Society. All rights reserved.

Keywords: Well testing

DRL-Based Time-Varying Workload Scheduling With Priority and Resource Awareness

IEEE Transactions on Network and Service Management, 2025, Vol. 22, No. 3, pp. 2838-2852

Authors: Liu, Qifeng; Fan, Qilin; Zhang, Xu; Li, Xiuhua; Wang, Kai; Xiong, Qingyu. Affiliations: Chongqing University, School of Big Data and Software Engineering, Chongqing 400044, China; Chongqing University, Key Laboratory of Dependable Service Computing in Cyber Physical Society of Ministry of Education, Chongqing 400044, China; Nanjing University, School of Electronic Science and Engineering, Nanjing 210023, China; Haihe Laboratory of Information Technology Application Innovation, Tianjin 300072, China; Harbin Institute of Technology, School of Computer Science and Technology, Weihai 264209, China; Shandong Key Laboratory of Industrial Network Security, Weihai 264209, China

With the proliferation of cloud services and the continuous growth in enterprises' demand for dynamic multi-dimensional resources, implementing an effective strategy for time-varying workload scheduling has become increasingly important. In this paper, we propose a deep reinforcement learning (DRL)-based method for time-varying workload scheduling, aiming to allocate resources efficiently across servers in the cluster. Specifically, we integrate a classifier and a queue scorer to construct a priority queue that exploits temporal resource utilization patterns across different workload classes. Then, we design parallel graph attention layers to capture the dimensional features and temporal dynamics of the cloud server cluster. Moreover, we propose a DRL algorithm to generate scheduling strategies that can adapt to dynamic environments. Validation on real-world traces from a Google cluster demonstrates that our method outperforms existing approaches in key metrics of cloud server cluster management. © 2004-2012 IEEE.

Keywords: Cloud platforms

PS3: Precise Patch Presence Test Based on Semantic Symbolic Signature

46th IEEE/ACM International Conference on Software Engineering, ICSE 2024

Authors: Zhan, Qi; Hu, Xing; Li, Zhiyang; Xia, Xin; Lo, David; Li, Shanping. Affiliations: Zhejiang University, The State Key Laboratory of Blockchain and Data Security, Hangzhou, China; Zhejiang University, Hangzhou, China; Software Engineering Application Technology Lab, Huawei, China; Singapore Management University, Singapore

ISBN: (print) 9798400702174

During software development, vulnerabilities have posed a significant threat to users. Patches are the most effective way to combat vulnerabilities. In a large-scale software system, testing the presence of a security patch in every affected binary is crucial to ensure system security. Identifying whether a binary has been patched for a known vulnerability is challenging, as there may only be small differences between patched and vulnerable versions. Existing approaches mainly focus on detecting patches in binaries that are compiled with the same compiler options. However, it is common for developers to compile programs with very different compiler options in different situations, which causes inaccuracy for existing methods. In this paper, we propose a new approach named PS3, referring to precise patch presence test based on semantic-level symbolic signature. PS3 exploits symbolic emulation to extract signatures that are stable under different compiler options. Then PS3 can precisely test the presence of the patch by comparing the signatures between the reference and the target at the semantic level. To evaluate the effectiveness of our approach, we constructed a dataset consisting of 3,631 (CVE, binary) pairs of 62 recent CVEs in four C/C++ projects. The experimental results show that PS3 achieves scores of 0.82, 0.97, and 0.89 in terms of precision, recall, and F1 score, respectively. PS3 outperforms the state-of-the-art baselines by 33% in terms of F1 score and remains stable under different compiler options. © 2024 ACM.
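The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. A small sketch (the helper name `f1_score` is introduced here for illustration and is not from the paper):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
ps3_f1 = f1_score(0.82, 0.97)
print(round(ps3_f1, 2))  # prints 0.89
```

The result agrees with the 0.89 stated in the abstract, so the three reported figures are mutually consistent at two decimal places.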

Keywords: Software design

NL2CTL: Automatic Generation of Formal Requirements Specifications via Large Language Models

25th International Conference on Formal Engineering Methods, ICFEM 2024

Authors: Zhao, Mengyan; Tao, Ran; Huang, Yanhong; Shi, Jianqi; Qin, Shengchao; Yang, Yang. Affiliations: National Trusted Embedded Software Engineering Technology Research Center, East China Normal University, Shanghai, China; Hardware/Software Co-Design Technology and Application Engineering Research Center, East China Normal University, Shanghai, China; Guangzhou Institute of Technology, Xidian University, Xi’an, China; ICTT and ISN Laboratory, Xidian University, Xi’an, China; Software Engineering Institute, East China Normal University, Shanghai, China

ISBN: (digital) 9789819606177

ISBN: (print) 9789819606160

Reducing the gap between natural language requirements and precise formal specifications is a critical task in requirements engineering. In recent years, requirements engineering has become increasingly complex alongside the growing intricacy of system engineering. Most requirements are expressed in natural language, which can be incomplete and ambiguous. However, formal languages with strict semantics can accurately represent certain temporal logic properties and allow for automated verification and analysis. This often limits the application of verification techniques, as writing formal specifications is a manual, error-prone, and time-consuming task. To address this, this paper proposes a framework that leverages Large Language Models (LLMs) to achieve automated conversion of natural language requirements to Computation Tree Logic (CTL). To address the issue of dataset scarcity, we leveraged the interactive and generative capabilities of LLMs. By constructing a random generation algorithm and utilizing prompt engineering, we generated an NL-CTL dataset using LLMs. The generated dataset was then used to fine-tune the T5-Large model, enhancing its generative capacity. To improve generalization, this paper proposes the use of the GPT-3.5 Atomic Proposition (AP) Recognition method, which eliminates the constraints of using the framework across different domains. A series of experimental evaluations showed that the fine-tuned LLM achieved an accuracy of 46.4%, whereas the LLM with few-shot learning using only prompt engineering achieved only 2% accuracy, demonstrating the feasibility of this approach. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
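As a purely illustrative example of the kind of translation described (not taken from the paper), the natural language requirement "whenever a request is issued, a grant eventually follows on every execution path" corresponds to the following CTL formula, where `request` and `grant` are hypothetical atomic propositions:

```latex
\mathbf{AG}\,\bigl(\mathit{request} \rightarrow \mathbf{AF}\,\mathit{grant}\bigr)
```

Here AG reads "on all paths, at every state" and AF reads "on all paths, at some future state".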

Keywords: Semantics
