Search results - Inner Mongolia University Library

arXiv, 2024

Authors: Cao, Jialun; Chen, Songqiang; Zhang, Wuqi; Lo, Hau Ching; Li, Yeting; Cheung, Shing-Chi. The Hong Kong University of Science and Technology, Hong Kong; Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China

Data contamination presents a critical barrier preventing widespread industrial adoption of advanced software engineering techniques that leverage code language models (CLMs). This phenomenon occurs when evaluation data inadvertently overlaps with the public code repositories used to train CLMs, severely undermining the credibility of performance evaluations. For software companies considering the integration of CLM-based techniques into their development pipeline, this uncertainty about true performance metrics poses an unacceptable business risk. Code refactoring, which comprises code restructuring and variable renaming, has emerged as a promising measure to mitigate data contamination. It provides a practical alternative to the resource-intensive process of building contamination-free evaluation datasets, which would require companies to collect, clean, and label code created after the CLMs' training cutoff dates. However, the lack of automated code refactoring tools and scientifically validated refactoring techniques has hampered widespread industrial implementation. To bridge the gap, this paper presents the first systematic study to examine the efficacy of code refactoring operators at multiple scales (method-level, class-level, and cross-class level) and in different programming languages. In particular, we develop an open-sourced toolkit, CODECLEANER, which includes 11 operators for Python, with nine method-level, one class-level, and one cross-class level operator. We elaborate on the rationale for why these operators could work to resolve data contamination and use both data-wise (e.g., N-gram matching overlap ratio) and model-wise metrics (e.g., perplexity) to quantify the efficacy after operators are applied. A drop of 65% in overlap ratio is found when applying all operators in CODECLEANER, demonstrating their effectiveness in addressing data contamination. Additionally, we migrate four operators to Java, showing their generalizability to another language.

Keywords: Java programming language

Reasoning About Exceptional Behavior At the Level of Java Bytecode

arXiv, 2024

Authors: Paganoni, Marco; Furia, Carlo A. Software Institute, USI Università della Svizzera italiana, Lugano, Switzerland

A program’s exceptional behavior can substantially complicate its control flow, and hence accurately reasoning about the program’s correctness. On the other hand, formally verifying realistic programs is likely to involve exceptions, a ubiquitous feature in modern programming languages. In this paper, we present a novel approach to verify the exceptional behavior of Java programs, which extends our previous work on BYTEBACK. BYTEBACK works on a program’s bytecode, while providing means to specify the intended behavior at the source-code level; this approach sets BYTEBACK apart from most state-of-the-art verifiers that target source code. To explicitly model a program’s exceptional behavior in a way that is amenable to formal reasoning, we introduce Vimp: a high-level bytecode representation that extends the Soot framework’s Grimp with verification-oriented features, thus serving as an intermediate layer between bytecode and the Boogie intermediate verification language. Working on bytecode through this intermediate layer brings flexibility and adaptability to new language versions and variants: as our experiments demonstrate, BYTEBACK can verify programs involving exceptional behavior in all versions of Java, as well as in Scala and Kotlin (two other popular JVM languages). © 2024, CC BY.

Keywords: Java programming language

GitHub Copilot: the perfect Code compLeeter?

arXiv, 2024

Authors: Siroš, Ilja; Singelée, Dave; Preneel, Bart. COSIC, KU Leuven, Leuven, Belgium

This paper aims to evaluate the quality of code generated by GitHub Copilot on the LeetCode problem set using a custom automated framework. We evaluate the results of Copilot for 4 programming languages: Java, C++, Python3 and Rust. We aim to evaluate Copilot’s reliability in the code generation stage, the correctness of the generated code and its dependency on the programming language, the problem’s difficulty level and the problem’s topic. In addition, we evaluate the code’s time and memory efficiency and compare it to the average human results. In total, we generate solutions for 1760 problems for each programming language and evaluate all of Copilot’s suggestions for each problem, resulting in over 50000 submissions to LeetCode spread over a 2-month period. We found that Copilot successfully solved most of the problems. However, Copilot was rather more successful in generating code in Java and C++ than in Python3 and Rust. Moreover, in the case of Python3, Copilot proved to be rather unreliable in the code generation phase. We also discovered that Copilot’s top-ranked suggestions are not always the best. In addition, we analysed how the topic of the problem impacts the correctness rate. Finally, based on statistical information from LeetCode, we can conclude that Copilot generates more efficient code than an average human. © 2024, CC BY.

Keywords: Java programming language

Human-AI Co-Creation of Worked Examples for Programming Classes

arXiv, 2024

Authors: Hassany, Mohammad; Brusilovsky, Peter; Ke, Jiaze; Akhuseyinoglu, Kamil; Narayanan, Arun Balajiee Lekshmi. University of Pittsburgh, Pittsburgh, PA 15260, United States; Carnegie Mellon University, Pittsburgh, PA 15213, United States

Worked examples (solutions to typical programming problems, presented as source code in a certain language and used to explain topics from a programming class) are among the most popular types of learning content in programming classes. Most approaches and tools for presenting these examples to students are based on line-by-line explanations of the example code. However, instructors rarely have time to provide line-by-line explanations for the large number of examples typically used in a programming class. In this paper, we explore and assess a human-AI collaboration approach to authoring worked examples for Java programming. We introduce an authoring system for creating Java worked examples that generates a starting version of code explanations and presents it to the instructor to edit if necessary. We also present a study that assesses the quality of explanations created with this approach. © 2024, CC BY-NC-ND.

Keywords: Java programming language

How do annotations affect Java code readability?

arXiv, 2024

Authors: Guerra, Eduardo; Gomes, Everaldo; Ferreira, Jeferson; Wiese, Igor; Lima, Phyllipe; Gerosa, Marco; Meirelles, Paulo. Free University of Bozen-Bolzano, Italy; University of São Paulo, Brazil; SONDA, Brazil; Federal Technological University of Paraná, Brazil; Federal University of Itajubá, Brazil; Northern Arizona University, United States

Context: Code annotations have gained widespread popularity in programming languages, offering developers the ability to attach metadata to code elements to define custom behaviors. Many modern frameworks and APIs use annotations to keep integration less verbose and located nearer to the corresponding code element. Despite these advantages, practitioners' anecdotal evidence suggests that annotations might negatively affect code readability. Objective: To better understand this effect, this paper systematically investigates the relationship between code annotations and code readability. Method: In a survey with software developers (n=332), we present 15 pairs of Java code snippets with and without code annotations. These pairs were designed considering five categories of annotation used in real-world Java frameworks and APIs. Survey participants selected the code snippet they considered more readable for each pair and answered an open question about how annotations affect the code's readability. Results: Preferences were scattered for all categories of annotation usage, revealing no consensus among participants. The answers were spread even when segregated by participants' programming or annotation-related experience. Nevertheless, some participants showed a consistent preference for or against annotations across all categories, which may indicate a personal preference. Our qualitative analysis of the open-ended questions revealed that participants often praise the impact of annotations on design, maintainability, and productivity but expressed contrasting views on understandability and code clarity. Conclusions: Software developers and API designers can consider our results when deciding whether to use annotations, equipped with the insight that developers express contrasting views of the annotations' impact on code readability. Copyright © 2024, The Authors. All rights reserved.

Keywords: Java programming language

Less Is More: A Mixed-Methods Study on Security-Sensitive API Calls in Java for Better Dependency Selection

arXiv, 2024

Authors: Rahman, Imranur; Paramitha, Ranindya; Plate, Henrik; Wermke, Dominik; Williams, Laurie. North Carolina State University, United States; Università degli Studi di Trento, Italy; Endor Labs

[Background:] Security-sensitive APIs provide access to security-sensitive resources, e.g., the filesystem or network resources. Including such API calls, directly or through dependencies, increases the application’s attack surface. An example of such a phenomenon is Log4Shell, which rendered many applications vulnerable due to network-related capabilities (JNDI lookup) in the log4j package. Before the Log4Shell incident, alternative logging libraries that do not make JNDI lookup calls were available. [Problem:] The impact of such an incident would be minimal if information about network-related API calls by logging libraries were available to developers. The lack of visibility into the calls to these security-sensitive APIs by functionally similar open-source packages makes it difficult for developers to use them as a dependency selection criterion. [Goal:] The goal of this study is to aid developers in selecting their dependencies by understanding security-sensitive APIs in their dependencies through call graph analysis. [Methodology:] We conducted a mixed-methods study with 45 Java packages and defined a list of 219 security-sensitive APIs. We categorized these 219 APIs into 3 themes and 15 categories. We then used call graph analysis to analyze the prevalence of these APIs in our selected package versions, with and without their dependencies. Finally, we conducted a survey with open-source developers (110 respondents) showing the comparison of functionally similar packages w.r.t. security-sensitive API calls to understand the usefulness of this API information in the dependency selection process. [Result:] The number of security-sensitive API calls of functionally similar packages can vary from 0 to 368 in one API category and 0 to 429 in total. Our survey results show that 73% of developers agree that information about the number and type of security-sensitive API calls of functionally similar packages would have been useful in their dependency selection.

Keywords: Java programming language

An LLM-based Readability Measurement for Unit Tests’ Context-aware Inputs

arXiv, 2024

Authors: Zhou, Zhichao; Tang, Yutian; Lin, Yun; He, Jingzhu. School of Information Science and Technology, ShanghaiTech University, China; The Shanghai Jiao Tong University, China; University of Glasgow, United Kingdom

Automated test techniques usually generate unit tests with higher code coverage than manual tests. However, the readability of automated tests is crucial for code comprehension and maintenance. The readability of unit tests involves many aspects. In this paper, we focus on test inputs. The central limitation of existing studies on input readability is that they focus on test code alone without taking the tested source code into consideration, so they either ignore the differing readability requirements of different source code or require manual effort to write readable inputs. However, we observe that the source code specifies the contexts that test inputs must satisfy. Based on this observation, we introduce the Context Consistency Criterion (a.k.a. C3), a readability measurement tool that leverages Large Language Models to extract the readability contexts of primitive-type (including string-type) parameters from the source code and checks whether test inputs are consistent with those contexts. We have also proposed EvoSuiteC3, which leverages C3’s extracted contexts to help EvoSuite generate readable test inputs. We have evaluated C3’s performance on 409 Java classes and compared manual and automated tests’ readability under C3 measurement. The results are two-fold. First, the precision, recall, and F1-score of C3’s mined readability contexts are 84.4%, 83%, and 83.7%, respectively. Second, under C3’s measurement, the string-type input readability scores of EvoSuiteC3, ChatUniTest (an LLM-based test generation tool), manual tests, and two traditional tools (EvoSuite and Randoop) are 90%, 83%, 68%, 8%, and 8%, showing the traditional tools’ inability to generate readable string-type inputs. We have conducted a survey based on questionnaires collected from 30 programmers with varied backgrounds. The results reveal that when C3 identifies readability differences between tests, programmers tend to give opinions of the tests’ readability similar to C3’s. © 2024, CC

Keywords: Java programming language

Multi-Scale Molecular Dynamics Simulations

arXiv, 2024

Authors: Boussinot, Frédéric

In molecular dynamics (MD), systems are molecules made up of atoms, and the aim is to determine their evolution over time. MD is based on a numerical resolution algorithm, whose role is to apply the forces generated by the various components, according to the equations of Newtonian physics. Molecular dynamics is currently mainly used in materials science and molecular biology. In this document, we limit ourselves to alkanes, which are non-cyclic carbon-hydrogenated chains. In the basic "All-atom" (AA) scale, all the atoms are directly simulated. In the "United-atom" (UA) scale, one considers grains that are composed of a carbon atom with the hydrogen atoms attached to it. Grains in the "Coarse-grained" (CG) scale are composed of two consecutive UA grains. In the multi-scale approach, one tries to use as much as possible the UA and CG scales, which can be simulated more efficiently than the AA scale. In this document, we mainly focus on three topics. First, we describe an MD system, implemented in the Java programming language, according to the Synchronous Reactive Programming approach, in which there exists a notion of a global logical time. This system is used to simulate molecules and also to build the potential functions at the UA and CG scales. Second, two methods to derive UA and CG potentials from AA potentials are proposed and analysed. Basically, both methods rely on strong geometrical links with the AA scale. We use these links with AA to determine the forms and values of the UA and CG potentials. In the first method (called "inverse-Boltzmann"), one considers data produced during several AA scale molecule simulations, and one processes these data using a statistical approach. In the second method ("minimisation method"), one applies a constrained-minimisation technique to AA molecules. The most satisfactory method clearly appears to be the minimisation-based one. The UA potentials we have determined have standard forms: they only differ from AA poten

Keywords: Java programming language

Breaking-Good: Explaining Breaking Dependency Updates with Build Analysis

arXiv, 2024

Authors: Reyes, Frank; Baudry, Benoit; Monperrus, Martin. Université de Montréal, Montréal, Canada; KTH Royal Institute of Technology, Stockholm, Sweden

Dependency updates often cause compilation errors when new dependency versions introduce changes that are incompatible with existing client code. Fixing breaking dependency updates is notoriously hard, as their root cause can be hidden deep in the dependency tree. We present Breaking-Good, a tool that automatically generates explanations for breaking updates. Breaking-Good provides a detailed categorization of compilation errors, identifying several factors related to changes in direct and indirect dependencies, incompatibilities between Java versions, and client-specific configuration. With a blended analysis of log and dependency trees, Breaking-Good generates detailed explanations for each breaking update. These explanations help developers understand the causes of the breaking update, and suggest possible actions to fix the breakage. We evaluate Breaking-Good on 243 real-world breaking dependency updates. Our results indicate that Breaking-Good accurately identifies root causes and generates automatic explanations for 70% of these breaking updates. Our user study demonstrates that the generated explanations help developers. Breaking-Good is the first technique that automatically identifies the causes of a breaking dependency update and explains the breakage accordingly. Copyright © 2024, The Authors. All rights reserved.

Keywords: Java programming language

Compilation of Commit Changes within Java Source Code Repositories

arXiv, 2024

Authors: Schott, Stefan; Fischer, Wolfram; Ponta, Serena Elisa; Klauke, Jonas; Bodden, Eric. Paderborn University, Paderborn, Germany; SAP Security Research, Mougins, France; Paderborn University, Fraunhofer IEM, Paderborn, Germany

Java applications include third-party dependencies as bytecode. To keep these applications secure, researchers have proposed tools to re-identify dependencies that contain known vulnerabilities. Yet, to allow such re-identification, one must first obtain, for each vulnerability patch, the bytecode fixing the respective vulnerability. Such patches for dependencies are curated in databases in the form of fix-commits. But fix-commits are in source code, and automatically compiling whole Java projects to bytecode is notoriously hard, particularly for non-current versions of the code. In this paper, we thus propose JESS, an approach that largely avoids this problem by compiling solely the relevant code that was modified within a given commit. JESS reduces the code, retaining only those parts that the committed change references. To avoid name-resolution errors, JESS automatically infers stubs for references to entities that are unavailable to the compiler. A challenge here is that, to facilitate the above-mentioned re-identification, JESS must seek to produce bytecode that is almost identical to the bytecode which one would obtain by a successful compilation of the full project. An evaluation on 347 GitHub projects shows that JESS is able to compile, in isolation, 72% of methods and constructors, of which 89% have bytecode equal to the original one. Furthermore, on the Project KB database of fix-commits, in which only 8% of files modified within the commits can be compiled with the provided build scripts, JESS is able to compile 73% of all files that these commits modify. © 2024, CC BY.

Keywords: Java programming language
