ISBN: 9798400716294 (print)
Context: Software testing ensures software quality, but developers often neglect it. Automated test generation is pursued to reduce the consequences of overlooked test cases in a software project. Problem: For Java programs, several tools can fully automate the generation of unit test sets, and studies have provided evidence about the quality of the generated test sets. However, these tools rely on machine learning and other AI algorithms rather than on the latest advances in Large Language Models (LLMs). Solution: This work evaluates the quality of Java unit tests generated by an OpenAI LLM, using metrics such as code coverage and mutation score. Method: We selected 33 programs used by other researchers in the field of automated test generation, in order to establish a baseline for comparison. For each program, 33 unit test sets were generated automatically, without human intervention, by varying OpenAI API parameters. After executing each test set, we collected line coverage, mutation score, and test execution success rate to evaluate the efficiency and effectiveness of each set. Summary of Results: The OpenAI LLM test sets performed similarly to the traditional automated Java test generation tools used in previous research across all evaluated aspects. These results are particularly remarkable given the simplicity of the experiment and the fact that the generated test code underwent no human analysis.
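The three metrics named in the abstract above can be sketched as simple ratios. This is a minimal illustration, not the study's tooling: the class name `TestSetResult`, its fields, and the sample numbers are hypothetical, and the mutation score is computed as killed mutants over all generated mutants (some definitions exclude equivalent mutants from the denominator).

```python
from dataclasses import dataclass


def ratio(part: int, whole: int) -> float:
    """Return part/whole, or 0.0 when there is nothing to measure."""
    return part / whole if whole else 0.0


@dataclass
class TestSetResult:
    """Raw counts for one generated test set (hypothetical field names)."""
    lines_total: int
    lines_covered: int
    mutants_total: int
    mutants_killed: int
    tests_run: int
    tests_passed: int

    def line_coverage(self) -> float:
        # Fraction of executable lines executed by the test set.
        return ratio(self.lines_covered, self.lines_total)

    def mutation_score(self) -> float:
        # Fraction of mutants detected (killed) by at least one test.
        return ratio(self.mutants_killed, self.mutants_total)

    def success_rate(self) -> float:
        # Fraction of generated tests that executed successfully.
        return ratio(self.tests_passed, self.tests_run)


# Made-up counts, for illustration only.
result = TestSetResult(lines_total=200, lines_covered=150,
                       mutants_total=40, mutants_killed=30,
                       tests_run=10, tests_passed=9)
print(result.line_coverage(), result.mutation_score(), result.success_rate())
```

Reporting the three ratios separately matters because a test set can reach high line coverage while killing few mutants, for example when its assertions are weak.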
ISBN: 9783642212949; 9783642212956 (print)
Entailment is an important problem in computational logic, and it is particularly relevant to the Inductive Logic Programming (ILP) community because it is at the core of the hypothesis coverage test, which is often the bottleneck of an ILP system. Despite developments in resolution heuristics and, more recently, in subsumption engines, most ILP systems simply use Prolog's left-to-right, depth-first selection function for SLD-resolution to perform the hypothesis coverage test. We implemented two alternative selection functions for SLD-resolution, smallest predicate domain (SPD) and smallest variable domain (SVD), and developed a subsumption engine, Subsumer. These entailment engines were fully integrated into the ILP system ProGolem. The performance of the four entailment engines (Prolog's built-in resolution, SPD, SVD, and Subsumer) is compared on a representative set of ILP datasets. As expected, on determinate datasets Prolog's built-in resolution is unrivalled. However, in the presence of even a little non-determinism, its performance quickly degrades and a more sophisticated entailment engine is required.
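To illustrate why the selection function matters on non-determinate data, the sketch below runs a coverage test over ground facts with a pluggable literal-selection strategy. It is a toy under stated assumptions: it handles only function-free literals matched against ground facts, and `fewest_candidates` is a generic "most constrained literal first" heuristic chosen for illustration, not the paper's definition of SPD or SVD. All names here are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Literal = Tuple[str, Tuple[str, ...]]      # (predicate, arguments)
Facts = Dict[str, List[Tuple[str, ...]]]   # predicate -> ground tuples
Subst = Dict[str, str]                     # variable -> constant


def is_var(term: str) -> bool:
    # Prolog convention: variables start with an uppercase letter.
    return term[:1].isupper()


def match(args: Tuple[str, ...], fact: Tuple[str, ...], subst: Subst) -> Optional[Subst]:
    """Extend subst so that args equals fact, or return None on a clash."""
    if len(args) != len(fact):
        return None
    new = dict(subst)
    for arg, value in zip(args, fact):
        if is_var(arg):
            if new.setdefault(arg, value) != value:
                return None
        elif arg != value:
            return None
    return new


def candidates(lit: Literal, facts: Facts, subst: Subst) -> List[Subst]:
    """All substitutions under which lit matches some fact."""
    pred, args = lit
    found = (match(args, fact, subst) for fact in facts.get(pred, []))
    return [s for s in found if s is not None]


def leftmost(goals: List[Literal], facts: Facts, subst: Subst) -> int:
    # Prolog-style selection: always resolve the first literal.
    return 0


def fewest_candidates(goals: List[Literal], facts: Facts, subst: Subst) -> int:
    # Illustrative heuristic: resolve the literal with the fewest matches.
    return min(range(len(goals)), key=lambda i: len(candidates(goals[i], facts, subst)))


def covers(goals: List[Literal], facts: Facts, select: Callable,
           subst: Optional[Subst] = None, stats: Optional[dict] = None) -> bool:
    """Depth-first search for one substitution satisfying every goal."""
    subst = {} if subst is None else subst
    if not goals:
        return True
    i = select(goals, facts, subst)
    rest = goals[:i] + goals[i + 1:]
    for s in candidates(goals[i], facts, subst):
        if stats is not None:
            stats["steps"] += 1  # count each resolution step attempted
        if covers(rest, facts, select, s, stats):
            return True
    return False


# p/1 is non-determinate (20 solutions); q/2 has a single fact.
facts = {"p": [(str(i),) for i in range(20)], "q": [("19", "z")]}
body = [("p", ("X",)), ("q", ("X", "Y"))]

for strategy in (leftmost, fewest_candidates):
    stats = {"steps": 0}
    print(strategy.__name__, covers(body, facts, strategy, stats=stats), stats["steps"])
```

Both strategies reach the same answer, but the leftmost strategy enumerates every solution of `p` before finding the one compatible with `q`, while resolving the most constrained literal first avoids that backtracking.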
ISBN: 9781509055692 (print)
Software is becoming more complex because of changing requirements and the rapid growth of the software industry. To improve software quality, we have to find the main causes of program crashes. However, manual debugging by developers is inefficient, especially for large software. Many automated tools have been developed to make fault localization more efficient and to reduce maintenance cost. Most research focuses on improving the software testing process, while the primary triage method, based on the stack-trace hash (e.g., SmartFuzz, the Basic Fuzzing Framework, and the Failure Observation Engine), has remained unchanged for a long time. We therefore propose a new triage method based on binary block coverage. Our method analyzes the binary-level coverage results each time an input causes the program to crash. For the same crashing inputs, we also apply the traditional stack-trace hash method to compare its classification of the flaws with that of our method. Our experimental results show that the proposed coverage-based method triages better in terms of the number of unique bugs identified and the correct classification of faults.
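The contrast between the two triage keys can be sketched as follows. This is an assumption-laden illustration, not the paper's algorithm: the crash records, frame names, and block addresses are made up, the stack hash uses the top five frames, and the coverage key is simply the exact set of executed basic-block addresses.

```python
import hashlib
from collections import defaultdict
from typing import Callable, Dict, Hashable, List


def stack_hash(crash: dict, depth: int = 5) -> str:
    """Bucket key built from the top `depth` stack frames."""
    top = "|".join(crash["frames"][:depth])
    return hashlib.sha1(top.encode()).hexdigest()[:12]


def coverage_key(crash: dict) -> frozenset:
    """Bucket key built from the set of basic blocks executed before the crash."""
    return frozenset(crash["blocks"])


def triage(crashes: List[dict], key: Callable[[dict], Hashable]) -> Dict[Hashable, List[str]]:
    """Group crash ids into buckets that share the same key."""
    buckets = defaultdict(list)
    for crash in crashes:
        buckets[key(crash)].append(crash["id"])
    return dict(buckets)


# Hypothetical crashes: same call stack, different paths through the binary.
crashes = [
    {"id": "c1", "frames": ["memcpy", "parse_chunk", "main"], "blocks": [0x400A10, 0x400A40, 0x400B00]},
    {"id": "c2", "frames": ["memcpy", "parse_chunk", "main"], "blocks": [0x400A10, 0x400A80, 0x400B00]},
    {"id": "c3", "frames": ["memcpy", "parse_chunk", "main"], "blocks": [0x400A10, 0x400A40, 0x400B00]},
]

print(len(triage(crashes, stack_hash)))    # buckets by stack-trace hash
print(len(triage(crashes, coverage_key)))  # buckets by block coverage
```

In this made-up data the stack-trace hash merges all three crashes into one bucket, while the coverage key separates the crash that took a different path. A practical tool would likely need a similarity threshold rather than exact set equality, since unrelated variation in covered blocks would otherwise split one bug across many buckets.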
ISBN: 9783662448571 (print)
The Glass Box Test (GBT), also known as White Box Test or Structural Test, shows which parts of the program under test have, or have not, been executed. Many GBT tools are available for almost every programming language. Industry standards for safety-critical software require very high or even complete coverage. At first glance, the GBT seems to be a well-established and mature testing technique based on standardized metrics. On closer inspection, however, the underlying models and metrics have several serious shortcomings, which lead to very imprecise and inconsistent coverage results across the various GBT tools. In this paper, a new and precise model for the GBT is presented. This model serves as a reference for the precise definition of all the popular coverage metrics. The tool CodeCover, which was developed at the University of Stuttgart, is an implementation that strictly follows those definitions.
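A small hand-instrumented example shows why coverage metrics need precise definitions: the same test run can yield different percentages depending on what is counted. The sketch below is illustrative only and does not reproduce the paper's model or CodeCover's instrumentation; the probe names and the function `clamp` are hypothetical.

```python
# Items each metric counts for the function below (hypothetical probe names).
STATEMENTS = {"s1", "s2", "s3"}
BRANCHES = {"if_true", "if_false"}


def clamp(x: int, limit: int, hits: set) -> int:
    """Clamp x to limit, recording which statements and branches execute."""
    hits.add("s1")            # the `if` statement itself
    if x > limit:
        hits.add("if_true")
        hits.add("s2")        # the assignment inside the `if`
        x = limit
    else:
        hits.add("if_false")  # implicit else: no statement of its own
    hits.add("s3")            # the return statement
    return x


def coverage(hits: set, items: set) -> float:
    """Fraction of the metric's items that were executed."""
    return len(hits & items) / len(items)


hits = set()
clamp(10, 5, hits)  # a single test that takes the true branch
print(coverage(hits, STATEMENTS), coverage(hits, BRANCHES))

clamp(1, 5, hits)   # a second test that takes the implicit else branch
print(coverage(hits, STATEMENTS), coverage(hits, BRANCHES))
```

The first test already executes every statement but only one of the two branches, because the `if` has no explicit `else`. Whether a tool counts that implicit branch is exactly the kind of modelling decision that can make results differ between tools.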
Context: The importance of automotive software has been increasing rapidly because software controls many components of motor vehicles, such as the smart-key system, the tire pressure monitoring system, and the advanced driver assistance system. Consequently, the automotive industry spends a large amount of human effort on testing automotive software and is interested in automated testing techniques that ensure high-quality automotive software with reduced human effort. Objective: Applying automated test generation techniques to automotive software is technically challenging because of false alarms caused by imprecise test drivers/stubs and the lack of tool support for symbolic analysis of bit-fields and function pointers in C. To address these challenges, we have developed an automated testing framework, MAESTRO. Method: MAESTRO automatically builds a test driver and stubs for a target task (i.e., a software unit consisting of target functions). It then generates test inputs for the target task, using the test driver and stubs, by applying concolic testing and fuzzing together in an adaptive way. In addition, MAESTRO transforms a target program that uses bit-fields into a semantically equivalent one that does not. MAESTRO also supports symbolic function pointers by identifying the candidate functions of a symbolic function pointer through static analysis. Results: MAESTRO achieved 94.2% branch coverage and 82.3% MC/DC coverage on the four target modules (238 KLOC) developed by Hyundai Mobis. Furthermore, it reduced the manual effort for coverage testing by 58.8%. Conclusion: By applying automated testing techniques, MAESTRO can achieve high test coverage for automotive software with significantly reduced manual testing effort.
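The idea behind replacing bit-fields with an equivalent form can be illustrated with plain shift-and-mask arithmetic on one integer word. This is a sketch of the arithmetic only, not MAESTRO's source-to-source transformation of C: the helper names and the field layout (least significant bits first) are assumptions, and real C compilers choose bit-field layout in an implementation-defined way.

```python
def get_field(word: int, offset: int, width: int) -> int:
    """Read an unsigned field of `width` bits starting at bit `offset`."""
    mask = (1 << width) - 1
    return (word >> offset) & mask


def set_field(word: int, offset: int, width: int, value: int) -> int:
    """Return `word` with the field replaced by `value`, truncated to `width` bits."""
    mask = (1 << width) - 1
    cleared = word & ~(mask << offset)          # zero out the old field
    return cleared | ((value & mask) << offset)  # write the new field


# Hypothetical layout: `mode` occupies bits 0-2, `level` occupies bits 3-7.
MODE = (0, 3)
LEVEL = (3, 5)

word = 0
word = set_field(word, *MODE, 5)
word = set_field(word, *LEVEL, 17)
print(word, get_field(word, *MODE), get_field(word, *LEVEL))

# Storing a value too wide for the field keeps only its low bits,
# which mirrors how an unsigned C bit-field wraps.
word = set_field(word, *MODE, 9)
print(get_field(word, *MODE))
```

Once every field access is expressed with ordinary integer operations like these, a symbolic analysis that does not understand bit-fields can still reason about the program.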
ISBN: 9780954414511 (print)
A fundamental challenge in software engineering remains the delivery of software with a minimum of remaining defects. The principal technique currently used in the software industry for the verification and validation of software is dynamic software testing, in which the software under consideration is actually executed using test data. However, generating test data for automated software testing is still mainly a manual task. The problem is compounded for Java programmers because testing criteria can be imposed at the Java bytecode level rather than at the source level. To alleviate these difficulties, an Interactive Bytecode Inspection System (IBIS) has been developed that allows examination of the Java bytecode and the automatic generation and execution of test data.