Search Results - Inner Mongolia University Library

32nd IEEE International Symposium on Software Reliability Engineering (ISSRE)

Authors: Le, Xuan-Bach D.; Le, Quang Loc. Affiliations: Univ Melbourne, Sch Comp & Informat Syst, Melbourne, Vic, Australia; UCL, London, England

ISBN: (print) 9781665425872

Software programs evolve naturally as part of the ever-changing customer needs and fast-paced market. Software evolution, however, often introduces regression bugs, which unduly break previously working functionalities of the software. To repair regression bugs, one needs to know when and where a bug emerged, e.g., the bug-inducing code changes, to narrow down the search space. Unfortunately, existing state-of-the-art automated program repair (APR) techniques have not yet fully exploited this information, rendering them less efficient and effective at navigating a potentially large search space containing many plausible but incorrect solutions. In this work, we revisit APR on repairing regression errors in Java programs. We empirically show that existing state-of-the-art APR techniques do not perform well on regression bugs due to their algorithm design and lack of knowledge of bug-inducing changes. We subsequently present REFIXAR, a novel repair technique that leverages software evolution history to generate high-quality patches for Java regression bugs. The key novelty that empowers REFIXAR to more efficiently and effectively traverse the search space is two-fold: (1) a systematic way for multi-version reasoning to capture how a software evolves through its history, and (2) a novel search algorithm over a set of generic repair templates, derived from the principle of incorrectness logic and informed by both past bug fixes and their bug-inducing code changes; this enables REFIXAR to achieve a balance of both genericity and specificity, i.e., generic common fix patterns of bugs and their specific contexts. We compare REFIXAR against the state-of-the-art APR techniques on a data set of 51 real regression bugs from 28 large real-world programs. Experiments show that REFIXAR significantly outperforms the best baseline by a large margin, i.e., REFIXAR correctly fixes 24 bugs while the best baseline correctly fixes only 9 bugs.

Keywords: Java programming language

Using Knowledge Units of Programming Languages to Recommend Reviewers for Pull Requests: An Empirical Study

arXiv, 2023

Authors: Ahasanuzzaman, Md; Oliva, Gustavo A.; Hassan, Ahmed E. Affiliation: School of Computing, Queen’s University, Kingston, ON, Canada

Code review is a key element of quality assurance in software development. Determining the right reviewer for a given code change requires understanding the characteristics of the changed code, identifying the skills of each potential reviewer (expertise profile), and finding a good match between the two. To facilitate this task, we design a code reviewer recommender that operates on the knowledge units (KUs) of a programming language. We define a KU as a cohesive set of key capabilities that are offered by one or more building blocks of a given programming language. We operationalize our KUs using certification exams for the Java programming language. We detect KUs from 10 actively maintained Java projects from GitHub, spanning 290K commits and 65K pull requests (PRs). Next, we generate developer expertise profiles based on the detected KUs. Finally, these KU-based expertise profiles are used to build a code reviewer recommender (KUREC). The key assumption of KUREC is that the code reviewers of a given PR should be experts in the KUs that appear in the changed files of that PR. In RQ1, we compare KUREC’s performance to that of four baseline recommenders: (i) a commit-frequency-based recommender (CF), (ii) a review-frequency-based recommender (RF), (iii) a modification-expertise-based recommender (ER), and (iv) a review-history-based recommender (CHREV). We observe that KUREC performs as well as the top-performing baseline recommender (RF). From a practical standpoint, we highlight that KUREC’s performance is more stable (lower interquartile range) than that of RF, thus making it more consistent and potentially more trustworthy. Next, in RQ2 we design three new recommenders by combining KUREC with our baseline recommenders. These new combined recommenders outperform both KUREC and the individual baselines. Finally, in RQ3 we evaluate how reasonable the recommendations from KUREC and the combined recommenders are when those deviate from the ground truth. KUREC is the

Keywords: Java programming language

How Do Java Developers Reuse StackOverflow Answers in Their GitHub Projects?

arXiv, 2023

Authors: Chen, Juntong; Zhao, Yan; Meng, Na. Affiliations: Virginia Tech, United States; Eastern Michigan University, United States

StackOverflow (SO) is a widely used question-and-answer (Q&A) website for software developers and computer scientists. GitHub is an online development platform used for storing, tracking, and collaborating on software projects. Prior work relates the information mined from both platforms to link user accounts or compare developers' activities across platforms. However, little work has been done to characterize the SO answers reused by GitHub projects. For this paper, we conducted an empirical study by mining the SO answers reused by Java projects available on GitHub. We created a hybrid approach of clone detection, keyword-based search, and manual inspection to identify the answer(s) actually leveraged by developers. Based on the identified answers, we further studied topics of the discussion threads, answer characteristics (e.g., scores, ages, code lengths, and text lengths), and developers' reuse practices. We observed that most reused answers offer programs to implement specific coding tasks. Among all analyzed SO discussion threads, the reused answers often have relatively higher scores, older ages, longer code, and longer text than unused answers. In only 9% of scenarios (40/430), developers fully copied answer code for reuse. In the remaining scenarios, they reused partial code or created brand new code from scratch. Our study characterized 130 SO discussion threads referred to by Java developers in 357 GitHub projects. Our observations can guide SO answerers to provide better answers, and shed light on future human-centric software engineering research that creates better tools to facilitate reliable and responsible code reuse. © 2023, CC BY.

Keywords: Java programming language

Code2tree: A Method for Automatically Generating Code Comments

Scientific Programming, 2022, Volume 2022, Issue 0

Authors: Wen, Wanzhi; Chu, Jiawei; Zhao, Tian; Zhang, Ruinian; Zhi, Bao; Shen, Chenqiang. Affiliation: Nantong Univ, Sch Informat Sci & Tech, Nantong 226019, Peoples R China

Source code comments can improve the efficiency of software development and maintenance. However, due to the heterogeneity of natural language and programming language, the quality of code comments is often low. This paper therefore proposes a novel method, Code2tree, which is based on the encoder-decoder model, to automatically generate Java code comments. Code2tree first converts Java source code into abstract syntax tree (AST) sequences, and then the AST sequences are encoded by a GRU encoder to solve the long sequence learning dependency problem. Finally, the attention mechanism is introduced in the decoding stage, and the quality of the code comment is improved by increasing the weight of the key information. We use the open dataset java-small to train the model and verify the effectiveness of Code2tree based on the commonly used indicators BLEU and F1-Score.

Keywords: Java programming language

Towards a Dataset of Programming Contest Plagiarism in Java

arXiv, 2023

Authors: Slobodkin, Evgeniy; Sadovnikov, Alexander. Affiliation: Sirius.Courses, Moscow, Russia

In this paper, we describe and present the first dataset of source code plagiarism specifically aimed at contest plagiarism. The dataset contains 251 pairs of plagiarized solutions of competitive programming tasks in Java, as well as 660 non-plagiarized ones; however, the described approach can be used to extend the dataset in the future. Importantly, each pair comes in two versions: (a) "raw" and (b) with participants’ repeated template code removed, allowing for evaluating tools in different settings. We used the collected dataset to compare the available source code plagiarism detection tools, including state-of-the-art ones, specifically in their ability to detect contest plagiarism. Our results indicate that the tools show significantly worse performance on contest plagiarism because of the template code and the presence of other misleadingly similar code. Of the tested tools, token-based ones demonstrated the best performance in both variants of the dataset. Copyright © 2023, The Authors. All rights reserved.

Keywords: Java programming language

Paloma: A Benchmark for Evaluating Language Model Fit

arXiv, 2023

Authors: Magnusson, Ian; Bhagia, Akshita; Hofmann, Valentin; Soldaini, Luca; Jha, Ananya Harsh; Tafjord, Oyvind; Schwenk, Dustin; Walsh, Evan Pete; Elazar, Yanai; Lo, Kyle; Groeneveld, Dirk; Beltagy, Iz; Hajishirzi, Hannaneh; Smith, Noah A.; Richardson, Kyle; Dodge, Jesse. Affiliations: Allen Institute for Artificial Intelligence, United States; Paul G. Allen School of Computer Science & Engineering, University of Washington, United States

Evaluations of language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains (varying distributions of language). We introduce PERPLEXITY ANALYSIS FOR LANGUAGE MODEL ASSESSMENT (PALOMA), a benchmark to measure LM fit to 546 English and code domains, instead of assuming perplexity on one distribution extrapolates to others. We include two new datasets of the top 100 subreddits (e.g., r/depression on Reddit) and programming languages (e.g., Java on GitHub), both sources common in contemporary LMs. With our benchmark, we release 6 baseline 1B LMs carefully controlled to provide fair comparisons about which pretraining corpus is best, and code for others to apply those controls to their own experiments. Our case studies demonstrate how the fine-grained results from PALOMA surface findings such as that models pretrained without data beyond Common Crawl exhibit anomalous gaps in LM fit to many domains, or that loss is dominated by the most frequently occurring strings in the vocabulary. © 2023, CC BY.

Keywords: Java programming language

How Effective Are Neural Networks for Fixing Security Vulnerabilities

arXiv, 2023

Authors: Wu, Yi; Jiang, Nan; Pham, Hung Viet; Lutellier, Thibaud; Davis, Jordan; Tan, Lin; Babkin, Petr; Shah, Sameena. Affiliations: Purdue University, West Lafayette, United States; University of Alberta, Camrose, Canada; York University, Toronto, Canada; J.P. Morgan AI Research, Palo Alto, United States

Security vulnerability repair is a difficult task that is in dire need of automation. Two groups of techniques have shown promise: (1) large code language models (LLMs) that have been pre-trained on source code for tasks such as code completion, and (2) automated program repair (APR) techniques that use deep learning (DL) models to automatically fix software bugs. This paper is the first to study and compare Java vulnerability repair capabilities of LLMs and DL-based APR models. The contributions include that we (1) apply and evaluate five LLMs (Codex, CodeGen, CodeT5, PLBART and InCoder), four fine-tuned LLMs, and four DL-based APR techniques on two real-world Java vulnerability benchmarks (Vul4J and VJBench), (2) design code transformations to address the training and test data overlapping threat to Codex, (3) create a new Java vulnerability repair benchmark VJBench, and its transformed version VJBench-trans, to better evaluate LLMs and APR techniques, and (4) evaluate LLMs and APR techniques on the transformed vulnerabilities in VJBench-trans. Our findings include that (1) existing LLMs and APR models fix very few Java vulnerabilities. Codex fixes the most vulnerabilities, 10.2 (20.4%). Many of the generated patches are uncompilable. (2) Fine-tuning with general APR data improves LLMs’ vulnerability-fixing capabilities. (3) Our new VJBench reveals that LLMs and APR models fail to fix many Common Weakness Enumeration (CWE) types, such as CWE-325 Missing cryptographic step and CWE-444 HTTP request smuggling. (4) Codex still fixes 8.7 transformed vulnerabilities, outperforming all the other LLMs. © 2023, CC BY-NC-ND.

Keywords: Java programming language

Improving Java Deserialization Gadget Chain Mining via Overriding-Guided Object Generation

arXiv, 2023

Authors: Cao, Sicong; Sun, Xiaobing; Wu, Xiaoxue; Bo, Lili; Li, Bin; Wu, Rongxin; Liu, Wei; He, Biao; Ouyang, Yu; Li, Jiajia. Affiliations: Yangzhou University, China; Xiamen University, China; Ant Group, China

Java (de)serialization is prone to causing security-critical vulnerabilities, in which attackers invoke existing methods (gadgets) on the application’s classpath to construct a gadget chain that performs malicious behaviors. Several techniques have been proposed to statically identify suspicious gadget chains and dynamically generate injection objects for fuzzing. However, due to their incomplete support for dynamic program features (e.g., Java runtime polymorphism) and ineffective injection object generation for fuzzing, the existing techniques are still far from satisfactory. In this paper, we first performed an empirical study to investigate the characteristics of Java deserialization vulnerabilities based on our manually collected 86 publicly known gadget chains. The empirical results show that 1) Java deserialization gadgets are usually exploited by abusing runtime polymorphism, which enables attackers to reuse serializable overridden methods; and 2) attackers usually invoke exploitable overridden methods (gadgets) via dynamic binding to generate injection objects for gadget chain construction. Based on our empirical findings, we propose a novel gadget chain mining approach, GCMiner, which captures both explicit and implicit method calls to identify more gadget chains, and adopts an overriding-guided object generation approach to generate valid injection objects for fuzzing. The evaluation results show that GCMiner significantly outperforms the state-of-the-art techniques, and discovers 56 unique gadget chains that cannot be identified by the baseline approaches. Copyright © 2023, The Authors. All rights reserved.

Keywords: Java programming language

Authoring Worked Examples for Java Programming with Human-AI Collaboration

arXiv, 2023

Authors: Hassany, Mohammad; Brusilovsky, Peter; Ke, Jiaze; Akhuseyinoglu, Kamil; Narayanan, Arun Balajiee Lekshmi. Affiliations: University of Pittsburgh, Pittsburgh, PA, United States; Carnegie Mellon University, Pittsburgh, PA, United States

Worked examples (solutions to typical programming problems, presented as source code in a certain language and used to explain the topics from a programming class) are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations for the large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach. © 2023, CC BY-NC-ND.

Keywords: Java programming language

Montsalvat: Intel SGX Shielding for GraalVM Native Images

arXiv, 2023

Authors: Yuhala, Peterson; Ménétrey, Jämes; Felber, Pascal; Schiavoni, Valerio; Tchana, Alain; Thomas, Gaël; Guiroux, Hugo; Lozi, Jean-Pierre. Affiliations: University of Neuchâtel, Neuchâtel, Switzerland; ENS Lyon, France; Télécom SudParis, Institut Polytechnique de Paris, France; Oracle Labs, Zürich, Switzerland

The popularity of the Java programming language has led to its wide adoption in cloud computing infrastructures. However, Java applications running in untrusted clouds are vulnerable to various forms of privileged attacks. The emergence of trusted execution environments (TEEs) such as Intel SGX mitigates this problem. TEEs protect code and data in secure enclaves inaccessible to untrusted software, including the kernel and hypervisors. To efficiently use TEEs, developers must manually partition their applications into trusted and untrusted parts, in order to reduce the size of the trusted computing base (TCB) and minimise the risks of security vulnerabilities. However, partitioning applications poses two important challenges: (i) ensuring efficient object communication between the partitioned components, and (ii) ensuring the consistency of garbage collection between the parts, especially with memory-managed languages such as Java. We present Montsalvat, a tool which provides a practical and intuitive annotation-based partitioning approach for Java applications destined for secure enclaves. Montsalvat provides an RMI-like mechanism to ensure inter-object communication, as well as consistent garbage collection across the partitioned components. We implement Montsalvat with GraalVM native-image, a tool for compiling Java applications ahead-of-time into standalone native executables that do not require a JVM at runtime. Our extensive evaluation with micro- and macro-benchmarks shows that our partitioning approach boosts performance in real-world applications by up to 6.6× (PalDB) and 2.2× (GraphChi) compared to solutions that naively include the entire application in the enclave. © 2023, CC BY.

Keywords: Java programming language
