ISBN: 9798400702211 (print)
Coverage analysis is widely used but can suffer from high overhead. This overhead is especially acute in the context of Python, which is already notoriously slow (a recent study observes a roughly 30x slowdown vs. native code). We find that the state-of-the-art coverage tool for Python, ***, introduces a median overhead of 180% with the standard Python interpreter. Slowdowns are even more extreme when using PyPy, a JIT-compiled Python implementation, with *** imposing a median overhead of 1,300%. This performance degradation reduces the utility of coverage analysis in most use cases, including testing and fuzzing, and precludes its use in deployment. This paper presents SLIPCOVER, a novel, near-zero overhead coverage analyzer for Python. SLIPCOVER works without modifications to either the Python interpreter or PyPy. It first processes a program's AST to accurately identify all branches and lines. SLIPCOVER then dynamically rewrites Python bytecodes to add lightweight instrumentation to each identified branch and line. At run time, SLIPCOVER periodically de-instruments already-covered lines and branches. The result is extremely low overhead (a median of just 5%), making SLIPCOVER suitable for use in deployment. We show that its efficiency can translate to significant increases in the speed of coverage-based clients. As a proof of concept, we integrate SLIPCOVER into TPBT, a targeted property-based testing system, and observe a 22x speedup.
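The first step described in the abstract above, walking a program's AST to find its lines and branches, can be sketched with Python's standard `ast` module. This is a minimal illustration under stated assumptions, not SLIPCOVER's implementation: the function name `lines_and_branches` and the `(from_line, to_line)` arc encoding (with `0` meaning "falls through past the statement") are hypothetical, and bytecode rewriting and de-instrumentation are not shown.

```python
import ast


def lines_and_branches(source: str):
    """Return (statement lines, branch arcs) found in `source`.

    A branch arc is a (from_line, to_line) pair. A to_line of 0 is a
    hypothetical marker meaning "falls through past the statement".
    """
    tree = ast.parse(source)
    lines = set()
    branches = set()
    for node in ast.walk(tree):
        # Every statement node marks a line that can be covered.
        if isinstance(node, ast.stmt):
            lines.add(node.lineno)
        # Only if/while/for are handled in this sketch.
        if isinstance(node, (ast.If, ast.While, ast.For)):
            # Taken arm: first statement of the body.
            branches.add((node.lineno, node.body[0].lineno))
            # Other arm: first statement of the else block, or fall-through.
            target = node.orelse[0].lineno if node.orelse else 0
            branches.add((node.lineno, target))
    return lines, branches


example = "x = 1\nif x:\n    y = 2\nelse:\n    y = 3\nwhile x:\n    x -= 1\n"
lines, branches = lines_and_branches(example)
print(sorted(lines))
print(sorted(branches))
```

A real tool would also need to handle constructs such as `match`, comprehensions, and exception handlers, which this sketch ignores.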
ISBN: 9781479917372 (print)
A continuous challenge facing software penetration testers is ensuring adequate coverage of a target application. Many dynamic application security testing tools and manual pen-testing techniques test only part of the exposed code base, leaving much of the attack surface untested. A purely black box approach, used by most DAST tools, makes it almost impossible to accurately identify how much of the attack surface of an application was tested for penetration during assessment. Glass box testing techniques, as described in this paper, significantly improve the insight that penetration testers have into the coverage and makeup of the applications they are targeting. This paper reports on DHS-funded research which resulted in an innovative open source tool called Code Pulse that provides real-time code coverage for pen-testing Java web applications. Code Pulse leverages the Java instrumentation libraries to provide a real-time glass box perspective of method calls as they are exercised during security testing activities. While the concept of glass box testing is not new, Code Pulse delivers a novel real-time approach to the challenge while remaining tool-agnostic. In this paper, we outline the code coverage challenges facing penetration testers, describe the state of the art in software assurance code coverage, the innovative aspects of our approach and its contribution to the state of the art, the feedback we have received since releasing it as an Open Web Application Security Project (OWASP) pen-testing application in May 2014, and the planned improvements to Code Pulse.
ISBN: 9798400717918 (print)
Code coverage analysis has become a standard approach in software development, facilitating the assessment of test suite effectiveness, the identification of under-tested code segments, and the discovery of performance bottlenecks. When code coverage of software for embedded systems needs to be measured, conventional approaches quickly meet their limits. A commonly used approach involves instrumenting the source files with added code that collects and dumps coverage information during runtime. This inserted code usually relies on the existence of an operating system and a file system to dump the collected data. These features are not available for bare-metal programs that are executed on embedded systems. To overcome this issue, we present NQC(2), a plugin for QEMU. NQC(2) extracts coverage information from QEMU during runtime and stores it in a file on the host machine. This approach is even compatible with modified QEMU versions and does not require target-software instrumentation. NQC(2) outperforms a comparable approach from Xilinx by up to 8.5x.
ISBN: 9781450341516 (print)
Structural coverage criteria are commonly used to determine the adequacy of a test suite. However, studies investigating structural coverage and fault-finding capabilities have mixed results. Some studies have shown that generating test suites to satisfy structural coverage criteria has a positive effect on finding faults, while other studies show the opposite. These mixed results indicate that there are factors not yet known that affect the ability of test suites satisfying structural coverage criteria to find faults. In order to improve the fault-finding capabilities of test suites, it is essential to understand what factors are causing this variance. Unfortunately, very little work has been done to investigate the variance observed in the relationship between structural coverage criteria and fault-finding capabilities. In this paper, we investigate one possible source of variation in the results observed: fault type. We provide an empirical study which narrows the focus of the relationship between structural coverage and fault-finding capabilities to object-oriented bugs. Specifically, we investigated 26 different types of object-oriented faults and evaluated how effectively test suites with high coverage percentages were able to detect each type of fault. We found that a test suite's ability to find faults varied significantly according to the type of fault (ranging from a rate of 0% to 87.5% mutants detected per fault type). We also found that particular types of faults were consistently found less frequently across all object programs.
ISBN: 9780738113197 (print)
In this paper, we propose DEEPRL4FL, a deep learning fault localization (FL) approach that locates the buggy code at the statement and method levels by treating FL as an image pattern recognition problem. DEEPRL4FL does so via novel code coverage representation learning (RL) and data dependencies RL for program statements. Those two types of RL on the dynamic information in a code coverage matrix are also combined with the code representation learning on the static information of the usual suspicious source code. This combination is inspired by crime scene investigation, in which investigators analyze the crime scene (failed test cases and statements) and related persons (statements with dependencies), and at the same time examine the usual suspects who have committed a similar crime in the past (similar buggy code in the training data). For the code coverage information, DEEPRL4FL first orders the test cases and marks error-exhibiting code statements, expecting that a model can recognize the patterns discriminating between faulty and non-faulty statements/methods. For dependencies among statements, the suspiciousness of a statement is assessed by taking into account its data dependencies on other statements in execution and data flows, in addition to the statement itself. Finally, the vector representations for the code coverage matrix, the data dependencies among statements, and the source code are combined and used as the input of a classifier built from a Convolutional Neural Network to detect buggy statements/methods. Our empirical evaluation shows that DEEPRL4FL improves the top-1 results over the state-of-the-art statement-level FL baselines by 173.1% to 491.7%. It also improves the top-1 results over the existing method-level FL baselines by 15.0% to 206.3%.
ISBN: 9781467386111 (print)
The goal of test suite prioritization is to maximize the rate of fault detection and code coverage. Several nature-inspired optimization algorithms such as Swarm Intelligence (SI) have been studied for the optimization of such problems. The studies revealed the benefits of Artificial Bee Colony (ABC) over other algorithms. ABC and its variations have been implemented in software testing areas, test suite prioritization in particular. However, most SI-based approaches focus on fault detection ability, which is difficult to predict. In this paper, the standard ABC algorithm is used to prioritize test suites based on code coverage. The results show that ABC is promising and, hence, a strong candidate for prioritizing test suites. They also suggest that a modification to the standard ABC algorithm, or a combination of ABC and another SI algorithm, should yield an even better result.
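To make "prioritizing based on code coverage" concrete, the sketch below orders tests with a simple greedy additional-coverage heuristic. This is deliberately not the ABC algorithm from the abstract above; it is a swapped-in baseline that shows the kind of objective a coverage-based prioritizer optimizes. The function name, the test names, and the coverage data are all hypothetical.

```python
def prioritize_by_additional_coverage(coverage):
    """Order test names so each next test adds the most uncovered lines.

    `coverage` maps a test name to the set of line numbers it executes.
    This is a greedy baseline, not Artificial Bee Colony.
    """
    remaining = dict(coverage)
    covered = set()
    order = []
    while remaining:
        # Pick the test that covers the most not-yet-covered lines;
        # ties are broken by name so the result is deterministic.
        best = min(remaining, key=lambda t: (-len(remaining[t] - covered), t))
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical coverage data for three tests.
suite = {"t1": {1, 2}, "t2": {1, 2, 3, 4}, "t3": {5}}
print(prioritize_by_additional_coverage(suite))
```

A search-based approach such as ABC would instead explore many candidate orderings and score each one with a coverage-rate fitness function.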
ISBN: 9798400706585 (print)
Dynamic analysis has emerged as a pivotal technique for testing Android apps, enabling the detection of bugs, malicious code, and vulnerabilities. A key metric in evaluating the efficacy of tools employed by both research and practitioner communities for this purpose is code coverage. Obtaining code coverage typically requires planting probes within apps to gather coverage data during runtime. Because source code is generally unavailable to analysts, apps must be instrumented to insert these probes in black-box environments. However, the tools available for such instrumentation are limited in their reliability and require intrusive changes that interfere with apps' functionalities. This paper introduces AndroLog, a novel tool developed on top of the Soot framework, designed to provide fine-grained coverage information at multiple levels, including classes, methods, statements, and Android components. In contrast to existing tools, AndroLog leaves the responsibility of testing apps to analysts, and its motto is simplicity. As demonstrated in this paper, AndroLog can instrument up to 98% of recent Android apps, compared with 79% and 48% for the existing tools COSMO and ACVTool, respectively. AndroLog also stands out for its potential for future enhancements to increase granularity on demand. We make AndroLog available to the community and provide a video demonstration of AndroLog.
ISBN: 9781565553514 (print)
We present LgDb 2.0, the second generation of LgDb, an innovative framework for kernel code coverage, profiling and simulation. LgDb is built on top of Lguest and allows running an inspected kernel in a virtual environment instead of modifying the running kernel or using an extra target machine. LgDb 2.0 uses the Lguest hypervisor and the KGDB kernel debugger to debug and instrument kernel code. Unlike the standard approaches, LgDb enlists the hypervisor to achieve a better debugging environment for kernel development. LgDb strives to provide a generic environment for running performance evaluation and checking decision coverage for any inspected kernel. LgDb 2.0 improves over the original LgDb by using a simulated serial port and inspecting the tested code using KGDB. By using KGDB, we eliminate the need for code injections, making profiling and code coverage testing easier.
ISBN: 9781728168364 (print)
The degree of code coverage reached by a test suite is an important indicator of the thoroughness of testing. Most coverage tools for Android apps work at the bytecode level and provide no information to developers about which source code lines have not yet been exercised by any test case. In this paper, we present COSMO, the first publicly available, fully automated Android app instrumenter that operates at the source code level in a completely transparent way, making it fully compatible with existing system-level testing technologies and Android test generators. The experiments that we have conducted on a large benchmark of Android apps show that COSMO can successfully instrument most apps without altering their execution traces, while introducing a small, acceptable runtime overhead.
ISBN: 9781728108698 (print)
Reliable code coverage tools are critically important, as they are heavily used to facilitate many quality assurance activities, such as software testing, fuzzing, and debugging. However, little attention has been devoted to assessing the reliability of code coverage tools. In this study, we propose a randomized differential testing approach to hunting for bugs in the most widely used C code coverage tools. Specifically, by generating random input programs, our approach looks for inconsistencies in code coverage reports produced by different code coverage tools, and then identifies inconsistencies as potential code coverage bugs. To effectively report code coverage bugs, we addressed three specific challenges: (1) how to filter out duplicate test programs, as many of them trigger the same bugs in code coverage tools; (2) how to automatically reduce large test programs to much smaller ones that have the same properties; and (3) how to determine which code coverage tools have bugs. The extensive evaluations validate the effectiveness of our approach, resulting in 42 and 28 confirmed/fixed bugs for gcov and llvm-cov, respectively. This case study indicates that code coverage tools are not as reliable as might have been envisaged. It not only demonstrates the effectiveness of our approach, but also highlights the need to continue improving the reliability of code coverage tools. This work opens up a new direction in code coverage validation, which calls for more attention in this area.
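The core comparison step of differential testing, finding lines on which two coverage reports disagree, can be sketched as follows. This is an illustrative sketch only: the abstract above does not specify how reports are parsed or which disagreements count as bugs, so the report representation (a mapping from line number to execution count), the function name, and the sample data are all assumptions.

```python
def coverage_inconsistencies(report_a, report_b):
    """Return {line: (count_a, count_b)} for lines where the reports disagree.

    Each report maps a line number to an execution count. A line missing
    from a report is treated as "not instrumented" and shown as None.
    """
    diffs = {}
    for line in sorted(set(report_a) | set(report_b)):
        a = report_a.get(line)
        b = report_b.get(line)
        if a != b:
            diffs[line] = (a, b)
    return diffs


# Hypothetical per-line execution counts from two coverage tools
# run on the same program with the same input.
tool_a = {1: 1, 2: 1, 3: 0, 4: 1}
tool_b = {1: 1, 2: 1, 3: 1, 4: 1}
print(coverage_inconsistencies(tool_a, tool_b))
```

Each reported disagreement is only a candidate: deciding which tool is wrong, and reducing the triggering program, are the separate challenges the abstract lists.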