Competitive games between agents arise in many critical applications, such as military unmanned aerial vehicles. Testing these agents is urgent to reduce the significant losses caused by their failures. Existing studies mainly construct a testing agent that competes with the target agent to induce its failures. These approaches usually focus on a single task and therefore require much more time for multi-task testing. However, if the previously tested tasks (source tasks) and the task to be tested (target task) share similar agents or task objectives, transferable knowledge from the source tasks can potentially increase testing effectiveness on the target task. We propose Demo2Test for transfer testing of agents in competitive environments, i.e., leveraging demonstrations of failure scenarios from the source task to boost testing effectiveness in the target task. Demo2Test trains a testing agent from demonstrations and incorporates action perturbation at key states to balance the number of revealed failures and their diversity. We conduct experiments in the simulated robotic competitive environments of MuJoCo. The results indicate that Demo2Test outperforms the best-performing baseline, with improvements ranging from 22.38% to 87.98% in the number of discovered failure scenarios and from 12.69% to 60.98% in their diversity.
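As a rough illustration of the action-perturbation idea mentioned in this abstract, the sketch below perturbs the testing agent's action only at states flagged as "key" during a rollout. It assumes a classic Gym-style environment interface, and every function and parameter name here is an assumption for illustration, not Demo2Test's actual API.

```python
# Hypothetical sketch: perturb the testing agent's action at key states to trade
# off the number of revealed failures against their diversity.
import numpy as np

def rollout_with_perturbation(env, policy, key_state_fn, noise_scale=0.2, seed=0):
    """Run one episode; at states flagged as key, add Gaussian noise to the action."""
    rng = np.random.default_rng(seed)
    obs, done, trajectory = env.reset(), False, []
    while not done:
        action = policy(obs)                      # testing agent trained from demonstrations
        if key_state_fn(obs):                     # e.g., states near critical decision points
            action = action + rng.normal(0.0, noise_scale, size=np.shape(action))
        obs, reward, done, info = env.step(action)
        trajectory.append((obs, action, info.get("failure", False)))  # "failure" flag assumed
    return trajectory
```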
Deep Learning models have become an integrated component of modern software systems. In response to the challenge of model design, researchers proposed Automated Machine Learning (AutoML) systems, which automatically search for model architecture and hyperparameters for a given task. Like other software systems, existing AutoML systems have shortcomings in their design. We identify two common and severe shortcomings in AutoML: the performance issue (i.e., searching for the desired model takes an unreasonably long time) and the ineffective search issue (i.e., AutoML systems are not able to find a sufficiently accurate model). After analyzing the AutoML workflow, we observe that existing AutoML systems overlook potential opportunities in the search space, search method, and search feedback, which leads to these performance and ineffective search issues. Based on our analysis, we design and implement DREAM, an automatic and general-purpose tool that alleviates and repairs the shortcomings of AutoML pipelines and conducts effective model searches for diverse tasks. It monitors the AutoML process to collect detailed feedback and automatically repairs shortcomings by expanding the search space and leveraging a feedback-driven search strategy. Our evaluation results show that DREAM can be applied to two state-of-the-art AutoML pipelines and effectively and efficiently repairs their shortcomings.
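A minimal sketch of a feedback-driven search loop with search-space expansion, in the spirit of what this abstract describes; it is not DREAM's implementation, and all names, feedback keys, and heuristics are illustrative assumptions.

```python
# Toy feedback-driven model search: propose trials, track feedback, and expand the
# search space when several trials pass without improvement.
import random

def feedback_driven_search(search_space, evaluate, budget=20, patience=5):
    """search_space: dict of hyperparameter -> list of candidate values.
    evaluate: callable(config) -> (score, feedback dict)."""
    best, since_improve = (None, float("-inf")), 0
    for _ in range(budget):
        config = {k: random.choice(v) for k, v in search_space.items()}  # propose a trial
        score, feedback = evaluate(config)
        if score > best[1]:
            best, since_improve = (config, score), 0
        else:
            since_improve += 1
        if since_improve >= patience:            # search looks stuck: widen the space
            for key, extra in feedback.get("expand", {}).items():  # hypothetical feedback key
                search_space.setdefault(key, []).extend(extra)
            since_improve = 0
    return best
```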
ISBN (print): 9798400710797
Kernel fuzzers rely heavily on program mutation to automatically generate new test programs based on existing ones. In particular, program mutation can alter the test's control and data flow inside the kernel by inserting new system calls, changing the values of call arguments, or performing other program mutations. However, due to the complexity of the kernel code and its user-space interface, finding effective mutations that lead to a desired outcome, such as increasing coverage or reaching a target code location, is extremely difficult, even with the widespread use of manually crafted heuristics. This work proposes SNOWPLOW, a kernel fuzzer that uses a learned white-box test mutator to enhance test mutation. The core of SNOWPLOW is an efficient machine learning model that learns to predict promising mutations given the test program to mutate, its kernel code coverage, and the desired coverage. SNOWPLOW is demonstrated on argument mutations of kernel tests and evaluated on recent Linux kernel releases. When fuzzing the kernels for 24 hours, SNOWPLOW shows a significant speedup in discovering new coverage (4.8x to 5.2x) and achieves higher overall coverage (7.0% to 8.6%). In a 7-day fuzzing campaign, SNOWPLOW discovers 86 previously unknown crashes. Furthermore, the learned mutator accelerates directed kernel fuzzing, reaching 19 target code locations 8.5x faster as well as two additional locations missed by the state-of-the-art directed kernel fuzzer.
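An illustrative sketch of the learned white-box mutator idea described above: score candidate argument mutations with a trained classifier given coverage features, then try the highest-ranked ones first. The featurization, mutation encoding, and model interface are assumptions (a scikit-learn-style classifier), not SNOWPLOW's code.

```python
# Rank candidate argument mutations by a learned estimate of how likely they are
# to reach the desired coverage.
import numpy as np

def rank_argument_mutations(model, cur_cov, target_cov, candidates):
    """candidates: list of (call_index, arg_index, new_value) tuples.
    model: binary classifier exposing predict_proba (e.g., scikit-learn style)."""
    feats = np.array([featurize(cur_cov, target_cov, m) for m in candidates])
    scores = model.predict_proba(feats)[:, 1]          # P(mutation is promising)
    return [m for _, m in sorted(zip(scores, candidates), key=lambda p: -p[0])]

def featurize(cur_cov, target_cov, mutation):
    call_idx, arg_idx, new_value = mutation
    return [call_idx, arg_idx, float(int(new_value) % 997),
            len(cur_cov), len(target_cov - cur_cov)]   # crude coverage-gap features
```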
Service choreographies present numerous engineering challenges, particularly with respect to testing activities, which traditional design-time approaches cannot properly address. A proposed online testing solution offers a powerful, extensible framework for effectively assessing service compositions, leading to a more trustworthy and reliable service ecosystem.
Quantum Computing (QC) promises computational speedup over classic computing. However, noise exists in near-term quantum computers. Quantum software testing (for gaining confidence in quantum software's correctness) is inevitably impacted by noise, i.e., it is impossible to know whether a test case failed due to noise or real faults. Existing testing techniques test quantum programs without considering noise, i.e., by executing tests on ideal quantum computer simulators. Consequently, they are not directly applicable to testing quantum software on real quantum computers or noisy simulators. Thus, we propose a noise-aware approach (named QOIN) to alleviate the noise effect on test results of quantum programs. QOIN employs machine learning techniques (e.g., transfer learning) to learn the noise effect of a quantum computer and filter it from a program's outputs. Such filtered outputs are then used as the input to perform test case assessments (determining the passing or failing of a test case execution against a test oracle). We evaluated QOIN on IBM's 23 noise models, Google's two available noise models, and Rigetti's Quantum Virtual Machine, with six real-world and 800 artificial programs. We also generated faulty versions of these programs to check whether a failing test case execution can be determined under noise. Results show that QOIN can reduce the noise effect by more than 80% on most noise models. We used an existing test oracle to evaluate QOIN's effectiveness in quantum software testing. The results showed that QOIN attained scores of 99%, 75%, and 86% for precision, recall, and F1-score, respectively, for the test oracle across the six real-world programs. For artificial programs, QOIN achieved scores of 93%, 79%, and 86% for precision, recall, and F1-score, respectively. This highlights QOIN's effectiveness in learning and filtering the noise effect from quantum program outputs.
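A toy stand-in for the "learn the noise effect and filter it from program outputs" idea above: QOIN itself uses transfer learning, whereas this simplified sketch fits a linear noise map from calibration runs and approximately inverts it, so treat every detail as an assumption rather than the paper's method.

```python
# Fit a noise map from (ideal, noisy) output distributions of calibration circuits,
# then approximately invert it to filter noise from a measured distribution.
import numpy as np

def learn_noise_map(ideal_dists, noisy_dists):
    """Least-squares fit of M such that noisy ≈ ideal @ M over calibration runs."""
    M, *_ = np.linalg.lstsq(np.array(ideal_dists), np.array(noisy_dists), rcond=None)
    return M

def filter_noise(noisy_dist, M):
    """Approximately invert the learned noise map and renormalize to a distribution."""
    est = np.array(noisy_dist) @ np.linalg.pinv(M)
    est = np.clip(est, 0.0, None)
    return est / est.sum()
```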
Software testing and debugging have become the most critical aspects of the development of modern embedded systems, mainly driven by growing software and hardware complexity, increasing integration, and tightening time-to-market deadlines. Software developers increasingly rely on on-chip trace and debug infrastructure to locate software bugs faster. However, the existing infrastructure offers limited visibility or relies on hefty on-chip buffers and wide trace ports that significantly increase system cost. This paper introduces a new technique called mcfTRaptor for capturing and compressing functional and time-stamped control-flow traces on-the-fly in modern multicore systems. It relies on private on-chip predictor structures and corresponding software modules in the debugger to significantly reduce the number of events that need to be streamed out of the target platform. Our experimental evaluation explores the effectiveness of mcfTRaptor as a function of the number of cores, encoding mechanisms, and predictor configurations. Compared to Nexus-like control-flow tracing, mcfTRaptor reduces the trace port bandwidth by 14 to 23.8 times for functional traces and by 10.8 to 18.6 times for time-stamped traces. (C) 2015 Elsevier B.V. All rights reserved.
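A conceptual sketch of predictor-based control-flow trace compression as outlined above: the target streams only the branch outcomes its predictor got wrong, and the debugger replays an identical predictor to reconstruct the full trace. mcfTRaptor's actual predictor structures are hardware; this software model is only an illustration of the general idea.

```python
# Stream only mispredicted branch events; a matching predictor on the debugger side
# can regenerate all correctly predicted outcomes.
class BimodalPredictor:
    def __init__(self, size=1024):
        self.table = [1] * size                 # 2-bit saturating counters, weakly not-taken

    def predict(self, pc):
        return self.table[pc % len(self.table)] >= 2

    def update(self, pc, taken):
        i = pc % len(self.table)
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)

def compress(branch_stream):
    """branch_stream: iterable of (pc, taken). Returns only the mispredicted events."""
    pred, out = BimodalPredictor(), []
    for pc, taken in branch_stream:
        if pred.predict(pc) != taken:
            out.append((pc, taken))             # only surprises leave the chip
        pred.update(pc, taken)
    return out
```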
Gameplay videos offer valuable insights into player interactions and game responses, particularly data about game bugs. Despite the abundance of gameplay videos online, extracting useful information remains a challenge. This article introduces a method for searching and extracting relevant videos from extensive video repositories using English text queries. Our approach requires no external information, like video metadata; it depends solely on video content. Leveraging the zero-shot transfer capabilities of the contrastive language-image pretraining (CLIP) model, our approach does not require any data labeling or training. To evaluate our approach, we present the GamePhysics dataset, comprising 26,954 videos from 1,873 games collected from the GamePhysics section of the Reddit website. Our approach shows promising results in our extensive analysis of simple and compound queries, indicating that our method is useful for detecting objects and events in gameplay videos. Moreover, we assess the effectiveness of our method by analyzing a carefully annotated dataset of 220 gameplay videos. The results of our study demonstrate the potential of our approach for applications such as a video search tool tailored to identifying video game bugs, which could greatly benefit quality assurance teams in finding and reproducing bugs.
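A minimal zero-shot sketch of scoring gameplay frames against an English query with CLIP, in the spirit of the approach above. The model variant, frame sampling, and score aggregation used by the paper are not shown here; this just demonstrates the text-to-frame similarity step using the Hugging Face transformers CLIP interface.

```python
# Score video frames against a text query with a pretrained CLIP model (zero-shot).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def score_frames(frames: list[Image.Image], query: str) -> torch.Tensor:
    """Return one similarity score per frame for the given English query."""
    inputs = processor(text=[query], images=frames, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    return out.logits_per_image.squeeze(-1)     # higher = frame matches query better

# Example: rank frames of one video for a physics-bug query.
# scores = score_frames(frames, "a car flying in the air")
# best_frame = int(scores.argmax())
```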
Software reliability is one of the most important internal attributes of software systems. Over the past three decades, many software reliability growth models have been proposed and discussed. Some research has also shown that the fault detection and removal processes of software can be described and modeled using an infinite-server queueing system. In practice, however, no company can afford unlimited resources to test and correct detected faults; consequently, the number of debuggers is limited, not infinite. In this paper, we propose an extended finite-server-queueing (EFSQ) model to analyze the fault removal process of a software system. Numerical examples based on real project data are illustrated. Evaluation results show that our proposed EFSQ model has fairly accurate software reliability prediction capability and depicts the real-life situation of software development activities more faithfully. Finally, applications of the proposed model to project management are also presented. Our model can provide a theoretically effective technique for managing the necessary testing and debugging activities in software project management.
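To make the finite-server idea concrete, the toy discrete-event simulation below lets detected faults queue for a limited pool of debuggers rather than an infinite one. The rates, event structure, and outputs are illustrative assumptions, not the paper's EFSQ formulation.

```python
# Toy simulation: faults arrive (are detected) at rate detect_rate and are fixed by
# at most num_debuggers servers, each fixing at rate fix_rate.
import heapq, random

def simulate(num_debuggers=3, detect_rate=2.0, fix_rate=0.8, horizon=100.0, seed=1):
    random.seed(seed)
    queue, busy, removed = 0, 0, 0
    events = [(random.expovariate(detect_rate), "detect")]
    while events:
        t, kind = heapq.heappop(events)
        if t > horizon:
            break
        if kind == "detect":
            queue += 1
            heapq.heappush(events, (t + random.expovariate(detect_rate), "detect"))
        else:                                   # a debugger finished a fix
            busy -= 1
            removed += 1
        while queue and busy < num_debuggers:   # assign waiting faults to free debuggers
            queue -= 1
            busy += 1
            heapq.heappush(events, (t + random.expovariate(fix_rate), "fix_done"))
    return removed, queue + busy                # faults removed vs. still pending

print(simulate())
```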
In the software reliability engineering literature, few attempts have been made to study the fault debugging environment using discrete-time modelling. Most endeavours assume that a detected fault has either been immediately removed or been perfectly debugged. Such discrete-time models may be used for any debugging environment and may be termed black-box, since they are applied without prior knowledge about the nature of the fault being debugged. To develop a white-box model, however, one needs to be cognizant of the debugging environment. During debugging, numerous factors affect the debugging process. These factors may be internal, for example, fault density and fault debugging complexity, or external, originating in the debugging environment itself, such as the skills of the debugging team and the debugging effort expenditures. Hence, in such an environment, fault removal may take a longer time after a fault has been detected. It is therefore imperative to clearly understand the testing and debugging environment, and hence there is an urgent need for a model that takes into account the fault debugging complexity and incorporates the learning phenomenon of the debugger under an imperfect debugging environment. This objective dictates developing a framework through an integrated modelling approach based on a nonhomogeneous Poisson process that incorporates these realistic factors during the fault debugging process. Actual software reliability data have been used to demonstrate the applicability of the proposed integrated framework. Copyright (C) 2016 John Wiley & Sons, Ltd.
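For orientation, a generic nonhomogeneous Poisson process (NHPP) form that combines the two ingredients named above, debugger learning and imperfect debugging, is sketched here. It is a standard textbook-style illustration, not the paper's actual integrated model.

```latex
% m(t): expected faults removed by time t; a: initial fault content; b: removal rate;
% \beta: learning (inflection) factor; \alpha: fraction of removals that introduce new faults.
\[
  m(t) \;=\; \frac{a\left(1 - e^{-bt}\right)}{1 + \beta e^{-bt}},
  \qquad
  a(t) \;=\; a + \alpha\, m(t), \quad 0 \le \alpha < 1 .
\]
```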