Effective techniques for post-silicon validation are required to better evaluate the functional correctness of increasingly complex multi- and many-core SoCs. However, there is little data evaluating the coverage of post-silicon validation efforts on industrial-scale designs. In this paper, we address this knowledge gap by instrumenting a nontrivial SoC with on-chip coverage monitors to measure the coverage achieved by typical post-silicon validation tests, such as booting the operating system (OS). We compare the coverage achieved pre- and post-silicon, and also measure the area overhead required to monitor post-silicon coverage. Our results show that the typical test of booting the OS often achieves high coverage, well correlated with what is achieved by pre-silicon directed tests, but in some blocks the coverage can be low or markedly different between pre- and post-silicon, highlighting the importance of post-silicon validation in general and post-silicon coverage measurement in particular.
Today, there are millions of third-party Android applications. Some of them are buggy or even malicious. To identify such applications, novel frameworks for automated black-box testing and dynamic analysis are being developed by the Android community. Code coverage is one of the most common metrics for evaluating the effectiveness of these frameworks. Furthermore, code coverage is used as a fitness function for guiding evolutionary and fuzzy testing techniques. However, there are no reliable tools for measuring fine-grained code coverage in black-box Android app testing. We present the Android Code coVerage Tool, ACVTool for short, that instruments Android apps and measures code coverage in the black-box setting at class, method and instruction granularity. ACVTool has successfully instrumented 96.9% of apps in our experiments. It introduces a negligible instrumentation time overhead, and its runtime overhead is acceptable for automated testing tools. We demonstrate the practical value of ACVTool in a large-scale experiment with Sapienz, a state-of-the-art automated testing tool. Using ACVTool on the same cohort of apps, we have compared the different coverage granularities applied by Sapienz in terms of the number of crashes found. Our results show that none of the applied coverage granularities clearly outperforms the others in this respect.
Code coverage criteria are commonly used to determine the adequacy of a test suite. However, studies investigating code coverage and fault-finding capabilities have mixed results. Some studies have shown that creating test suites to satisfy coverage criteria has a positive effect on finding faults, while other studies have not. In order to improve the fault-finding capabilities of test suites, it is essential to understand what is causing these mixed results. In this study, we investigated one possible source of variation in the results observed: fault type. Specifically, we studied 45 different types of faults and evaluated how effectively human-created test suites with high coverage percentages were able to detect each type of fault. Our results showed, with statistical significance, that specific types of faults were found less frequently than others. However, improvements in the formulation and selection of test oracles could overcome these weaknesses. Based on our results and the types of faults that were missed, we suggest focusing on the strength of test oracles along with code coverage to improve the effectiveness of test suites.
ISBN: 9783030672201 (digital); 9783030672201, 9783030672195 (print)
Code coverage has been used in the software testing context mostly as a metric to assess the quality of a generated test suite. Recently, code coverage analysis has been used as a white-box testing technique for test optimization. Most research activities focus on using code coverage for test prioritization and selection within automated testing strategies. Less effort has been devoted in the literature to using code coverage for test generation. This paper introduces a new Code Coverage-based Test Case Generation (CCTG) concept that changes current practice by utilizing code coverage analysis in the test generation process. CCTG uses the code coverage data to calculate the impact of the input parameters for a constraint solver, automating the generation of effective test suites. We applied this approach to a few real-world case studies. The results showed that the new test generation approach could generate effective test cases and detect new faults.
ISBN: 9781450327305 (print)
Assertions are gaining importance in pre-silicon hardware verification to ensure expected design behavior. Coverage of an assertion in terms of statements of Register Transfer Level (RTL) source code is a very accessible metric for understanding the scope of assertions and for debugging. However, few methods to report it currently exist. We present a methodology to define and compute the code coverage of an assertion. Our method is based on static and dynamic analysis of the RTL source code. We demonstrate the scalability and effectiveness of our approach with experimental results on real designs for both manually and automatically generated assertions.
Understanding code coverage is an important precursor to software maintenance activities (e.g., better testing). Although modern code coverage tools provide key insights, they typically rely on code instrumentation, resulting in significant performance overhead. An alternative approach to code instrumentation is to process an application's source code and the associated log traces in tandem. This so-called "log-based code coverage" approach does not impose the same performance overhead as code instrumentation. Chen et al. proposed LogCoCo, a tool that implements log-based code coverage for Java. While LogCoCo breaks important new ground, it has fundamental limitations, namely uncertainty due to the lack of logging statements in conditional branches, and imprecision caused by dependency injection. In this study, we propose Log2Cov, a tool that generates log-based code coverage for programs written in Python and addresses the uncertainty and imprecision issues. We evaluate Log2Cov on three large and active open-source systems. More specifically, we compare the performance of Log2Cov to that of ***, an instrumentation-based coverage tool for Python. Our results indicate that 1) Log2Cov achieves high precision without introducing runtime overhead; and 2) uncertainty and imprecision can be reduced by up to 11% by statically analyzing the program's source code and execution logs, without requiring additional logging instrumentation from developers. While our enhancements make substantial improvements, we find that future work is needed to handle conditional statements and exception handling blocks to achieve parity with instrumentation-based approaches. We conclude the paper by drawing attention to these promising directions for future work.
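The idea of log-based code coverage, and the uncertainty that arises when a branch has no logging statement, can be illustrated with a minimal sketch. This is not the Log2Cov or LogCoCo implementation: the block table, the log message patterns, and the three labels are hypothetical, and the sketch assumes every logging statement is enabled at the log level in use.

```python
import re

# Hypothetical table: code block -> regex of the log message it emits,
# or None when the block contains no logging statement.
BLOCKS = {
    "handle_request": r"received request \d+",
    "cache_hit_branch": None,
    "cache_miss_branch": r"cache miss for key \w+",
    "error_branch": r"request failed: .+",
}


def log_based_coverage(blocks, log_lines):
    """Label each block from log evidence alone, without instrumenting the code."""
    result = {}
    for block, pattern in blocks.items():
        if pattern is None:
            # No logging statement: the logs cannot tell whether the block ran.
            result[block] = "uncertain"
        elif any(re.search(pattern, line) for line in log_lines):
            result[block] = "covered"
        else:
            # Assumes the message would have been printed had the block run.
            result[block] = "not covered"
    return result


logs = [
    "INFO received request 17",
    "DEBUG cache miss for key user42",
]
labels = log_based_coverage(BLOCKS, logs)
```

In this toy run the branch without a logging statement stays "uncertain", which is the kind of gap the abstract attributes to missing logging statements in conditional branches.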
ISBN: 0780366158 (print)
Unit code test coverage has long been known to be an important metric for testing software, and many development groups require 85% coverage to achieve quality targets. Assume we have a test, T-1, which has 100% code coverage and detects a set of defects, D-1. The question answered here is: "What percentage of the defects in D-1 will be detected if a random subset of the tests in T-1, which has code coverage of X% of the code, is applied to the code?" The purpose of this paper is to show the relation between code quality and code coverage. The relationship will be derived via a model of code defect levels. A sampling technique will be employed and modeled with the hypergeometric distribution, while assuming a uniform probability and a random distribution of defects in the code, which will invoke the binomial distribution. The result of this analysis will be a simple relation between defect level and the quality of the code delivered after the unit code is tested. This model results in a rethinking of the use of unit code test metrics and the use of support tools.
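One possible reading of the sampling model can be sketched numerically. The sketch below is an assumption, not the paper's derivation: it treats the code as a population of units, places the defects in distinct units chosen uniformly at random, treats X% coverage as a sample of X% of the units drawn without replacement, and counts a defect as detected exactly when its unit is in the sample. All function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, population, defects, sample):
    """P(exactly k defective units fall in a sample drawn without replacement)."""
    lowest = max(0, sample - (population - defects))
    highest = min(defects, sample)
    if k < lowest or k > highest:
        return 0.0
    return (comb(defects, k) * comb(population - defects, sample - k)
            / comb(population, sample))


def expected_detected_fraction(population, defects, coverage):
    """Expected share of the defects that lie in the covered part of the code."""
    sample = round(coverage * population)
    mean = sum(k * hypergeom_pmf(k, population, defects, sample)
               for k in range(defects + 1))
    return mean / defects


def prob_all_missed(population, defects, coverage):
    """Probability that the covered part of the code contains no defect at all."""
    sample = round(coverage * population)
    return hypergeom_pmf(0, population, defects, sample)


fraction = expected_detected_fraction(population=1000, defects=20, coverage=0.85)
```

Under these assumptions the expected fraction of detected defects equals the coverage fraction, because the hypergeometric mean is the sample size times the defect density.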
This paper investigates the impact of code coverage on machine learning-based dynamic analysis of Android malware. In order to maximize code coverage, dynamic analysis on Android typically requires the generation of events to trigger the user interface and maximize the discovery of run-time behavioral features. The event generation approach commonly used in most existing Android dynamic analysis systems is the random-based approach implemented with the Monkey tool that comes with the Android SDK. Monkey is utilized in popular dynamic analysis platforms like AASandbox, vetDroid, MobileSandbox, TraceDroid, Andrubis, ANANAS, DynaLog, and HADM. In this paper, we propose and investigate approaches based on stateful event generation and compare their code coverage capabilities with the state-of-the-practice random-based Monkey approach. The two proposed approaches are the state-based method (implemented with DroidBot) and a hybrid approach that combines the state-based and random-based methods. We compare the three input generation methods on real devices, in terms of their ability to log dynamic behavior features and their impact on various machine learning algorithms that utilize the behavioral features for malware detection. Experiments performed using 17,444 applications show that, overall, the proposed methods provide much better code coverage, which in turn leads to more accurate machine learning-based malware detection compared to the state-of-the-art approach.
Fuzzing is an effective approach to detecting software vulnerabilities by utilizing changeable generated inputs. However, fuzzing the network protocol on the firmware of IoT devices is limited by the inefficiency of test case generation, cross-architecture instrumentation, and fault detection. In this article, we propose Fw-fuzz, a coverage-guided and cross-platform framework for fuzzing network services running in the context of firmware on embedded architectures, which can generate more valuable test cases by introspecting program runtime information and using a genetic algorithm model. Specifically, we propose novel dynamic instrumentation in Fw-fuzz to collect the running state of the firmware program. Fw-fuzz then adopts a genetic algorithm model to guide the generation of inputs with high code coverage. We fully implement the prototype system of Fw-fuzz and conduct evaluations on network service programs of various architectures, namely MIPS, ARM, and PPC. Compared with the protocol fuzzers Boofuzz and Peach in terms of edge coverage, our prototype system achieves an average growth of 33.7% and 38.4%, respectively. We further verify six known vulnerabilities and discover five 0-day vulnerabilities with Fw-fuzz, which demonstrates the validity and utility of our framework. The overhead of our system is an additional 5% of memory growth.
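The general shape of coverage-guided input generation can be shown with a toy loop. This is only a sketch in the spirit of the description above, not Fw-fuzz itself: the target function, its edge names, and the mutation operator are hypothetical, and the loop uses mutation plus coverage-based selection only, without the crossover step a full genetic algorithm would normally include.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical stand-in for an instrumented service: returns the edges hit."""
    edges = {"entry"}
    if len(data) > 0 and data[0] == ord("G"):
        edges.add("cmd_G")
        if len(data) > 1 and data[1] == ord("E"):
            edges.add("cmd_GE")
            if len(data) > 2 and data[2] == ord("T"):
                edges.add("cmd_GET")
    return edges


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte and sometimes append another random byte."""
    buf = bytearray(data or b"\x00")
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    if rng.random() < 0.3:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(seed_inputs, rounds=2000, seed=0):
    """Keep a mutated input only if it reaches an edge not seen before."""
    rng = random.Random(seed)
    corpus = list(seed_inputs)
    seen = set()
    for item in corpus:
        seen |= toy_target(item)
    for _ in range(rounds):
        child = mutate(rng.choice(corpus), rng)
        new_edges = toy_target(child) - seen
        if new_edges:  # selection step: coverage gain acts as the fitness signal
            corpus.append(child)
            seen |= new_edges
    return corpus, seen


corpus, seen = fuzz([b"AAA"])
```

Inputs that reach new edges survive into the corpus and become parents for later mutations, which is how coverage feedback steers generation toward deeper program states.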
ISBN: 9781467392723 (print)
The calculation of test coverage is often infeasible for large-scale mining software repositories studies, as its computation requires building each project and executing its test suite. Because of that, we have been working on heuristics to calculate code coverage based on static code analysis. However, our results have been disappointing so far. In this paper, we present our approach to the problem and an evaluation involving 18 open source projects (around 2,700 classes) from the Apache Software Foundation. Results show that our approach provides acceptable results for only 50% of all classes. We believe researchers can learn from our mistakes and possibly derive a better approach. We advise researchers who need to use code coverage in their studies to select projects with a well-defined build system, such as Maven.