Deep Neural Networks (DNNs) have achieved significant results across various application domains. Testing and enhancing a pre-trained DNN that has been deployed in an application scenario is crucial because it can reduce the DNN's failures. DNN-driven software testing and enhancement require large amounts of labeled data. The high cost and inefficiency of manually labeling large volumes of data, and the time needed to test every case in real scenarios, are unacceptable. Therefore, test case selection techniques have been proposed to reduce the time cost by selecting and labeling only representative test cases without compromising testing performance. Test case selection based on neuron coverage (NC) or uncertainty metrics has achieved significant success in Convolutional Neural Network (CNN) testing. However, it is challenging to transfer these methods to Recurrent Neural Networks (RNNs), which excel at text tasks, because of the mismatch in model output formats and the reliance on image-specific characteristics. Moreover, the execution cost of the algorithm must be balanced against its performance. In this paper, we propose DeepVec, a state-vector aware test case selection method for RNN models that reduces the cost of data labeling, saves computing resources, and balances execution cost against performance. DeepVec selects data using an uncertainty metric based on the norm of the output vector at each time step (i.e., the state-vector) and a similarity metric based on the direction angle of the state-vector, because test cases with smaller state-vector norms often possess greater information entropy, and similar changes in the state-vector direction angle indicate similar RNN internal states. These metrics can be calculated with just a single inference, and they give DeepVec strong bug detection and model improvement capabilities. We evaluate DeepVec on five popular datasets, containing images and texts as well as commonl
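The two metrics described in this abstract can be illustrated with a minimal sketch. It is not the DeepVec implementation: the function names, the mean-norm aggregation, the scalar "mean angle change" profile, and the greedy `min_gap` filter are all assumptions made for illustration. It only shows how a norm-based uncertainty score and a direction-angle similarity check could both be computed from the state-vectors of a single inference.

```python
import numpy as np

def norm_uncertainty(states):
    """states: (T, d) array of per-time-step state-vectors for one test case.
    Smaller norms are treated as more uncertain, so the score is the
    negated mean norm (the aggregation is an assumption)."""
    norms = np.linalg.norm(states, axis=1)
    return -float(norms.mean())

def angle_changes(states, eps=1e-12):
    """Angles in radians between consecutive state-vectors."""
    a, b = states[:-1], states[1:]
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def mean_angle_change(states):
    """Scalar summary of how the state-vector direction changes over time."""
    angles = angle_changes(states)
    return float(angles.mean()) if angles.size else 0.0

def select(cases, budget, min_gap=0.1):
    """Hypothetical greedy selection: visit cases from most to least
    uncertain and skip any case whose angle profile is within min_gap
    of an already selected one."""
    ranked = sorted(range(len(cases)),
                    key=lambda i: norm_uncertainty(cases[i]), reverse=True)
    chosen, profiles = [], []
    for i in ranked:
        if len(chosen) == budget:
            break
        p = mean_angle_change(cases[i])
        if all(abs(p - q) >= min_gap for q in profiles):
            chosen.append(i)
            profiles.append(p)
    return chosen
```

In this sketch, `cases` would be a list of arrays collected from one forward pass per test case; the function can return fewer than `budget` indices when many cases share a similar angle profile.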
Several digital threats were investigated. Malware dominated the analysis with 45 attacks. We also found 30 phishing attacks, 22 data breaches, 15 cyber espionage incidents, and 18 identity theft cases. This indicates the kind and frequency of ha...
Feature selection is a critical aspect of improving the interpretability of machine learning models. Genetic Programming (GP) has a built-in feature selection mechanism that explores the search space to include inform...
Graphs are valuable data structures used to represent complex relationships between entities in a wide range of applications, such as social networks and chemical reactions. The subgraph counting problem is a well-known h...
In today's dynamic world, providing inclusive and personalized support for individuals with physical disabilities is imperative. With diverse needs and preferences, tailored assistance according to user personas i...
The job shop scheduling problem is an important combinatorial optimisation problem in the real world. Genetic programming hyper-heuristic has been successfully applied to automatically evolve effective dispatching rul...
This paper focuses on analysing neural network models that are used for semantically classifying tabular customer datasets. Additionally, we propose a custom neural network architecture to analyze tabular datasets and ...
Recent advances [1, 2] in offline reinforcement learning (RL) have taken a new perspective on the problem, departing from conventional methods that concentrate on learning value functions or policy gradients. Instead, the problem is viewed as a generic sequence modeling task, where past experiences consisting of state-action-reward triplets are input to the Transformer.
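As a rough illustration of the sequence modeling view, the sketch below flattens a trajectory of state-action-reward triplets into one interleaved token sequence of the kind a Transformer could consume. The function name, the token layout, and the use of plain tuples instead of learned embeddings are assumptions for illustration, not the format used in the cited work.

```python
def flatten_trajectory(trajectory):
    """trajectory: list of (state, action, reward) triplets, one per time step.
    Returns a flat list of (timestep, kind, value) tokens in the order
    state, action, reward (the ordering is an assumption)."""
    tokens = []
    for t, (state, action, reward) in enumerate(trajectory):
        tokens.append((t, "state", state))
        tokens.append((t, "action", action))
        tokens.append((t, "reward", reward))
    return tokens
```

A trajectory of T steps yields 3T tokens; in a real model each token would then be embedded according to its kind and time step before being passed to the Transformer.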
Sparse multi-dimensional gene expression data refers to datasets that have a vast number of features and observations, where a substantial portion of the entries are zero or missing values. In such datasets, the number...
Text-based statistical steganography is one of the methods of embedding hidden messages in plain text that is hardest for humans to detect, which makes it useful for concealing information. Steganalysis is its counter, the proces...