Search Results - Inner Mongolia University Library

MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning

ACM Transactions on Software Engineering and Methodology, 1000

Authors: Boyang Yang, Haoye Tian, Jiadong Ren, Hongyu Zhang, Jacques Klein, Tegawende Bissyande, Claire Le Goues, Shunfu Jin

Affiliations: School of Information Science and Engineering, Yanshan University, China; School of Computing and Information Systems, University of Melbourne, Australia; School of Big Data and Software Engineering, Chongqing University, China; SnT, University of Luxembourg, Luxembourg; School of Computer Science, Carnegie Mellon University, USA

Within the realm of software engineering, specialized tasks on code, such as program repair, present unique challenges, necessitating the fine-tuning of large language models (LLMs) to unlock state-of-the-art performance. Fine-tuning approaches proposed in the literature for LLMs on program repair tasks generally overlook the need to reason about the logic behind code changes, beyond syntactic patterns in the data. High-performing fine-tuning experiments also usually come at very high computational costs. With MORepair, we propose a novel perspective on the learning focus of LLM fine-tuning for program repair: we not only adapt the LLM parameters to the syntactic nuances of the task of code transformation (objective ➊), but we also specifically fine-tune the LLM with respect to the logical reason behind the code change in the training data (objective ➋). Such a multi-objective fine-tuning will instruct LLMs to generate high-quality patches. We apply MORepair to fine-tune four open-source LLMs with different sizes and architectures. Experimental results on function-level and repository-level repair benchmarks show that the implemented fine-tuning effectively boosts LLM repair performance by 11.4% to 56.0%. We further show that our fine-tuning strategy yields superior performance compared to the state-of-the-art approaches, including standard fine-tuning, Fine-tune-CoT, and RepairLLaMA.
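To make the two-objective idea concrete, here is a minimal sketch of one way two training objectives could be combined into a single loss. The abstract does not say how MORepair weights or combines its objectives, so the weighted sum, the `weight` parameter, and the function names below are all assumptions made for illustration only.

```python
import math


def cross_entropy(token_probs):
    """Mean negative log-likelihood of the reference tokens.

    token_probs: probabilities the model assigned to each correct token.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def multi_objective_loss(patch_probs, rationale_probs, weight=0.5):
    """Hypothetical weighted sum of two objectives.

    patch_probs:     token probabilities for the repaired code (objective 1).
    rationale_probs: token probabilities for the explanation of the
                     code change (objective 2).
    weight:          assumed mixing coefficient, not taken from the paper.
    """
    loss_patch = cross_entropy(patch_probs)
    loss_rationale = cross_entropy(rationale_probs)
    return weight * loss_patch + (1.0 - weight) * loss_rationale
```

With `weight=1.0` this reduces to ordinary single-objective fine-tuning on the patch alone, which is a convenient baseline for comparison.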

Keywords: Program Repair; Fine-tuning; Large Language Model; Open Source

Overlapping Aware Data Placement Optimizations for LSM Tree-Based Store on ZNS SSDs

ACM Transactions on Architecture and Code Optimization, 1000

Authors: Jingcheng Shen, Lang Yang, Linbo Long, Zhenhua Tan, Congming Gao, Kan Zhong, Masao Okita, Fumihiko Ino

Affiliations: School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China; School of Big Data & Software Engineering, Chongqing University, Chongqing, China; Xiamen University, Xiamen, China; Chongqing University, Chongqing, China; School of Information Science and Technology, Osaka University, Suita, Japan

Solid State Drives (SSDs) based on the NVMe Zoned Namespaces (ZNS) interface can notably reduce the costs of address mapping, garbage collection, and over-provisioning by dividing the storage space into multiple zones for sequential writes and random reads. The Log-Structured Merge (LSM) tree, which is extensively used in key-value storage systems, converts random writes to sequential writes, making it a suitable use case for ZNS SSDs. However, data associated with the LSM tree varies significantly in lifetime due to the levels and merging mechanisms of the LSM tree. Therefore, without an accurate method to estimate data lifetime, data with disparate lifetimes may be placed in the same zone, thus causing low space utilization and high write amplification within the ***. To address these issues, the paper proposes two data overlapping aware optimizations to realize intelligent data placement: a zone allocation scheme and a garbage collection scheme. The key technique of these optimizations is an accurate data-lifetime estimation that considers both the associated tree level of the data and the data overlapping ratio between the data and those in the neighboring level. Using the estimation technique, the zone allocation optimization can place data with similar lifetimes in the same zone. In addition, the garbage collection optimization can reclaim zones in an adaptive manner based on overlapping ratios to reduce the amount of data migration. Experimental results demonstrate that the optimization schemes effectively reduce garbage collection-incurred data copy by average factors of 2.11× and 1.50× in comparison to a conventional work and a state-of-the-art work, respectively. Consequently, the proposed work successfully alleviates the write amplification effect by 18% and 6%, compared to the conventional work and the state-of-the-art work, respectively.
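As a rough illustration of placing data with similar lifetimes in the same zone, the sketch below groups tables by tree level and by a coarse bucket of their overlapping ratio. The abstract does not give the estimation formula, so the bucketing scheme, the `buckets` parameter, and the function names are hypothetical; the overlapping ratio is assumed to lie in the range 0 to 1.

```python
from collections import defaultdict


def lifetime_class(level, overlap_ratio, buckets=4):
    """Map (tree level, overlap ratio) to a coarse lifetime class.

    Assumption: tables on the same level with a similar overlap ratio
    are treated as having similar lifetimes. overlap_ratio is in [0, 1].
    """
    bucket = min(int(overlap_ratio * buckets), buckets - 1)
    return (level, bucket)


def assign_zones(sstables, buckets=4):
    """Group tables so that each lifetime class shares one zone group.

    sstables: iterable of (name, level, overlap_ratio) tuples.
    Returns a dict mapping lifetime class -> list of table names.
    """
    zones = defaultdict(list)
    for name, level, overlap_ratio in sstables:
        zones[lifetime_class(level, overlap_ratio, buckets)].append(name)
    return dict(zones)


if __name__ == "__main__":
    tables = [("a", 1, 0.10), ("b", 1, 0.15), ("c", 1, 0.90), ("d", 2, 0.10)]
    print(assign_zones(tables))
```

In this toy input, tables `a` and `b` land in the same group because they share a level and a similar overlap ratio, while `c` and `d` are kept apart.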

Keywords: ZNS SSDs; LSM-tree; data placement; garbage collection

Model Pruning-enabled Federated Split Learning for Resource-constrained Devices in Artificial Intelligence Empowered Edge Computing Environment

ACM Transactions on Sensor Networks, 1000

Authors: Yongzhe Jia, Bowen Liu, Xuyun Zhang, Fei Dai, Arif Khan, Lianyong Qi, Wanchun Dou

Affiliations: State Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing, China; Department of Computing, Macquarie University, Sydney, Australia; College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China; M3S Empirical Software Engineering Research Unit, University of Oulu, Oulu, Finland; College of Computer Science and Technology, China University of Petroleum (East China), Qingdao Campus, Qingdao, China

Distributed Collaborative Machine Learning (DCML) has emerged in artificial intelligence-empowered edge computing environments, such as the Industrial Internet of Things (IIoT), to process tremendous data generated by smart devices. However, parallel DCML frameworks require resource-constrained devices to update the entire Deep Neural Network (DNN) models and are vulnerable to reconstruction attacks. Concurrently, the serial DCML frameworks suffer from training efficiency problems due to their serial training nature. In this paper, we propose a Model Pruning-enabled Federated Split Learning framework (MP-FSL) to reduce resource consumption with a secure and efficient training scheme. Specifically, MP-FSL compresses DNN models by adaptive channel pruning and splits each compressed model into two parts that are assigned to the client and the server. Meanwhile, MP-FSL adopts a novel aggregation algorithm to aggregate the pruned heterogeneous models. We implement MP-FSL with a real FL platform to evaluate its performance. The experimental results show that MP-FSL outperforms the state-of-the-art frameworks in model accuracy by up to 1.35%, while concurrently reducing storage and computational resource consumption by up to 32.2% and 26.73%, respectively. These results demonstrate that MP-FSL is a comprehensive solution to the challenges faced by DCML, with superior performance in both reduced resource consumption and enhanced model performance.
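The prune-then-split flow described above can be sketched in a few lines. The abstract does not state which criterion MP-FSL uses for its adaptive channel pruning, so this sketch substitutes a plain L1-norm ranking with a fixed `keep_ratio`; the function names and the cut index are likewise hypothetical.

```python
def prune_channels(channels, keep_ratio):
    """Return indices of the channels to keep, in their original order.

    channels:   list of per-channel weight lists.
    keep_ratio: fraction of channels to keep (assumed fixed here;
                the paper describes an adaptive scheme).
    Channels are ranked by L1 norm, an assumed stand-in criterion.
    """
    keep = max(1, round(len(channels) * keep_ratio))
    ranked = sorted(
        range(len(channels)),
        key=lambda i: sum(abs(w) for w in channels[i]),
        reverse=True,
    )
    return sorted(ranked[:keep])


def split_model(layers, cut):
    """Split a layer list into a client part and a server part."""
    return layers[:cut], layers[cut:]


if __name__ == "__main__":
    weights = [[0.1, -0.1], [2.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
    print(prune_channels(weights, keep_ratio=0.5))
    print(split_model(["conv1", "conv2", "fc"], cut=1))
```

Keeping the first layers on the client and the rest on the server follows the usual split-learning arrangement; where to place the cut is a design choice the sketch leaves to the caller.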

Keywords: Federated learning; split learning; model pruning; edge computing

Cheetah: Accelerating Dynamic Graph Mining with Grouping Updates

ACM Transactions on Architecture and Code Optimization, 1000

Authors: Yi Zhang, Xiaomeng Yi, Yu Huang, Jingrui Yuan, Chuangyi Gui, Dan Chen, Long Zheng, Jianhui Yue, Xiaofei Liao, Hai Jin, Jingling Xue

Affiliations: National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China; Zhejiang Lab, Hangzhou, China; National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Software Engineering, Huazhong University of Science and Technology, Wuhan, China; Eastern Institute of Technology, Ningbo, China; National University of Singapore, Singapore; Michigan Technological University, Houghton, United States; School of Computer Science and Engineering, UNSW Sydney, Kensington, Australia

Graph pattern mining is essential for deciphering complex networks. In the real world, graphs are dynamic and evolve over time, necessitating updates in mining patterns to reflect these changes. Traditional methods use fine-grained incremental computation to avoid full re-mining after each update, which improves speed but often overlooks potential gains from examining inter-update interactions holistically, thus missing out on overall efficiency ***. In this paper, we introduce Cheetah, a dynamic graph mining system that processes updates in a coarse-grained manner by leveraging exploration domains. These domains exploit the community structure of real-world graphs to uncover data reuse opportunities typically missed by existing approaches. Exploration domains, which encapsulate extensive portions of the graph relevant to updates, allow multiple updates to explore the same regions efficiently. Cheetah dynamically constructs these domains using a management module that identifies and maintains areas of redundancy as the graph changes. By grouping updates within these domains and employing a neighbor-centric expansion strategy, Cheetah minimizes redundant data accesses. Our evaluation of Cheetah across five real-world datasets shows it outperforms current leading systems by an average factor of 2.63×.
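The idea of grouping updates that touch the same region of the graph can be illustrated with a small sketch. It assumes a precomputed vertex-to-community mapping and uses the communities of an edge's endpoints as a stand-in for an exploration domain; Cheetah's actual domain construction is not described in the abstract, and `group_updates` and `community_of` are hypothetical names.

```python
from collections import defaultdict


def group_updates(updates, community_of):
    """Group edge updates by the communities their endpoints belong to.

    updates:      list of (u, v) edge updates.
    community_of: dict mapping vertex -> community id (assumed given).
    Returns a dict mapping a sorted tuple of community ids -> updates,
    so updates in the same region can be processed together.
    """
    groups = defaultdict(list)
    for u, v in updates:
        key = tuple(sorted({community_of[u], community_of[v]}))
        groups[key].append((u, v))
    return dict(groups)


if __name__ == "__main__":
    community_of = {1: 0, 2: 0, 3: 1, 4: 1}
    updates = [(1, 2), (3, 4), (2, 1), (1, 3)]
    print(group_updates(updates, community_of))
```

Updates that fall entirely inside one community share a single-element key, while an update crossing two communities gets its own two-element key.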

Keywords: Graph mining system; dynamic graph; incremental computing
