ISBN (digital): 9798350330663
ISBN (print): 9798350330670
Software systems often encounter various errors or exceptions in practice, so proper error handling code is essential to their reliability. Unfortunately, error handling code is often bug-prone, and testing it sufficiently is challenging because such code often cannot be triggered under normal conditions. Motivated by this, recent studies have proposed software fault injection (SFI) based fuzzing to discover potential bugs in complicated error handling code. Despite the promising results, their effectiveness and efficiency are still compromised in practice by the huge search space of error sites, inadequate fuzzing guidance, and the overhead induced by context-sensitive SFI. To test error handling code effectively and efficiently, this study presents AFL-FI, which first uses a similarity-based method to identify suspicious error sites and then incorporates the idea of error site coverage to guide the fuzzing process. Finally, a lightweight context-sensitive SFI design enables AFL-FI to execute test cases efficiently. We evaluate AFL-FI on eight large-scale open-source projects, and the results show that it significantly outperforms existing state-of-the-art fuzzing tools in terms of branch coverage. More importantly, AFL-FI has discovered 13 previously unknown bugs, all of which have been confirmed and 12 of which have been fixed. Our evaluation also demonstrates that all the key designs of AFL-FI are effective and contribute significantly to its overall performance.
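To make the idea of context-sensitive fault injection and error site coverage concrete, here is a minimal Python sketch. It is not AFL-FI's implementation: the `FaultInjector` class, the decorator names, and the toy `alloc`/`load_*` functions are all hypothetical, and the exhaustive loop in `find_unhandled` stands in for the fuzzer's guided search. The sketch treats an error site as a pair of (function that can fail, chain of callers), records which pairs a run reaches, and then forces each pair to fail in turn.

```python
class FaultInjector:
    """Toy injector: fails one chosen (error site, calling context) pair."""

    def __init__(self):
        self.target = None    # (site, context) pair to fail, or None
        self.covered = set()  # every (site, context) pair reached so far
        self._stack = []      # names of the instrumented callers in progress

    def context(self, name):
        """Decorator that records a caller, giving sites their context."""
        def wrap(fn):
            def inner(*args, **kwargs):
                self._stack.append(name)
                try:
                    return fn(*args, **kwargs)
                finally:
                    self._stack.pop()
            return inner
        return wrap

    def site(self, name):
        """Decorator that marks a function as an error site."""
        def wrap(fn):
            def inner(*args, **kwargs):
                key = (name, tuple(self._stack))
                self.covered.add(key)  # error site coverage
                if key == self.target:
                    raise OSError(f"injected fault at {name}")
                return fn(*args, **kwargs)
            return inner
        return wrap


def find_unhandled(injector, entry):
    """Fail each reachable pair once; return those whose error escapes."""
    injector.target = None
    entry()  # discovery run: record the reachable pairs
    failing = []
    for key in sorted(injector.covered):
        injector.target = key
        try:
            entry()
        except OSError:
            failing.append(key)  # no handler caught the injected error
    injector.target = None
    return failing


inj = FaultInjector()

@inj.site("alloc")
def alloc(n):
    return bytearray(n)

@inj.context("load_header")
def load_header():
    return alloc(16)  # no handler: an injected failure escapes

@inj.context("load_body")
def load_body():
    try:
        return alloc(64)
    except OSError:
        return None  # error handling code under test

def run():
    return load_header(), load_body()

unhandled = find_unhandled(inj, run)
```

In this toy program the same `alloc` site is handled when called from `load_body` but not from `load_header`, which is the kind of distinction a context-insensitive injector would miss.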
The pre-trained language model BERT has brought significant performance improvements to a range of natural language processing tasks, but its large scale makes it difficult to apply in many practical scenarios. With the continuous development of edge computing, deploying models on resource-constrained edge devices has become a trend. In a distributed edge environment, a critical task is to account for data distribution differences, labeling costs, and privacy while shrinking the model. This paper proposes a new BERT distillation method with source-free unsupervised domain adaptation. By combining source-free unsupervised domain adaptation with knowledge distillation, the method improves the performance of the BERT model on cross-domain data. In an experimental evaluation on a cross-domain sentiment analysis task, our method improves average prediction accuracy by up to around 4% compared with other methods.
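The abstract does not give the loss function, so the following is only a sketch of the widely used temperature-scaled distillation loss, not the paper's formulation. The function names and the default temperature are assumptions. It shows why distillation suits an unlabeled target domain: the student is trained to match the teacher's softened output distribution, so no ground-truth labels are needed.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by T squared.

    Assumes the logits are moderate enough that no student probability
    underflows to zero.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# The loss is zero when the student reproduces the teacher exactly,
# and positive when the two distributions disagree.
same = distillation_loss([2.0, 0.5], [2.0, 0.5])
different = distillation_loss([2.0, 0.5], [0.5, 2.0])
```

A higher temperature flattens both distributions, which exposes more of the teacher's relative preferences among the non-top classes to the student.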
ISBN (digital): 9798350341058
ISBN (print): 9798350341065
Over the past decade, various methods for detecting side-channel leakage have been proposed and proven to be effective against CPU side-channel attacks. These methods are valuable in assisting developers to identify and patch side-channel vulnerabilities. Nevertheless, recent research has revealed the feasibility of exploiting side-channel vulnerabilities to steal sensitive information from GPU applications, which are beyond the reach of previous side-channel detection methods. Therefore, in this paper, we conduct an in-depth examination of various GPU features and present Owl, a novel side-channel detection tool targeting CUDA applications on NVIDIA GPUs. Owl is designed to detect and locate side-channel leakage in various types of CUDA applications. When tracking the execution of CUDA applications, we design a hierarchical tracing scheme and extend the A-DCFG (Attributed Dynamic Control Flow Graph) to address the massively parallel execution in CUDA, ensuring Owl's detection scalability. After completing the initial assessment and filtering, we conduct statistical tests on the differences in program traces to determine whether they are indeed caused by input variations, subsequently facilitating the positioning of side-channel leaks. We evaluate Owl's capability to detect side-channel leaks by testing it on Libgpucrypto, PyTorch, and nvJPEG. Meanwhile, we verify that our solution effectively handles a large number of threads. Owl has successfully identified hundreds of leaks within these applications. To the best of our knowledge, we are the first to implement side-channel leakage detection for general CUDA applications.
In the design and planning of next-generation Internet of Things (IoT), telecommunication, and satellite communication systems, controller placement is crucial in software-defined networking (SDN). The programmability of the SDN controller is sophisticated for the centralized control system of the entire ***, it creates a significant loophole for the manifestation of a distributed denial of service (DDoS) attack ***, recently a distributed Reflected Denial of Service (DRDoS) attack, an unusual DDoS attack, has been ***, minimal deliberation has been given to this forthcoming single point of SDN infrastructure failure ***, recently the high frequencies of DDoS attacks have increased *** this paper, a smart algorithm for planning SDN smart backup controllers under DDoS attack scenarios has *** proposed smart algorithm can recommend single or multiple smart backup controllers in the event of DDoS *** obtained simulated results demonstrate that the validation of the proposed algorithm and the performance analysis achieved 99.99% accuracy in placing the smart backup controller under DDoS attacks within 0.125 to 46508.7 s in SDN.
ISBN (digital): 9783981926385
ISBN (print): 9798350348606
Logic diagnosis is a key step in yield learning. Multiple-fault diagnosis is challenging for several reasons, including error masking, fault reinforcement, and the huge search space of possible fault combinations. This work proposes a two-phase method for multiple-fault diagnosis. The first phase efficiently reduces the number of potential fault candidates through machine learning. The second phase obtains the final diagnosis results by formulating the task as a combinatorial optimization problem that is then solved iteratively using binary evolution computation. Experiments show that our method outperforms two existing methods for multiple-fault diagnosis, and achieves better diagnosability (improved by $1.87\times$) and resolution (improved by $1.42\times$) compared with a state-of-the-art commercial diagnosis tool.
Deep learning has been widely used in source code classification tasks, such as code classification according to their functionalities, code authorship attribution, and vulnerability detection. Unfortunately, the black-box nature of deep learning makes it hard to interpret and understand why a classifier (i.e., classification model) makes a particular prediction on a given example. This lack of interpretability (or explainability) might have hindered their adoption by practitioners because it is not clear when they should or should not trust a classifier's prediction. The lack of interpretability has motivated a number of studies in recent years. However, existing methods are neither robust nor able to cope with out-of-distribution examples. In this paper, we propose a novel method to produce Robust interpreters for a given deep learning-based code classifier; the method is dubbed Robin. The key idea behind Robin is a novel hybrid structure combining an interpreter and two approximators, while leveraging the ideas of adversarial training and data augmentation. Experimental results show that on average the interpreter produced by Robin achieves a 6.11% higher fidelity (evaluated on the classifier), 67.22% higher fidelity (evaluated on the approximator), and 15.87x higher robustness than that of the three existing interpreters we evaluated. Moreover, the interpreter is 47.31% less affected by out-of-distribution examples than that of LEMNA.
Self-supervised learning usually uses a large amount of unlabeled data to pre-train an encoder that can serve as a general-purpose feature extractor, so that downstream users only need to fine-tune it to enjoy the benefits of a "large model". Despite this promising prospect, the security of pre-trained encoders has not yet been thoroughly investigated, especially when the pre-trained encoder is publicly available for commercial *** this paper, we propose AdvEncoder, the first framework for generating downstream-agnostic universal adversarial examples based on the pre-trained encoder. AdvEncoder aims to construct a universal adversarial perturbation or patch for a set of natural images that can fool all the downstream tasks inheriting the victim pre-trained encoder. Unlike the setting in traditional adversarial example work, the pre-trained encoder outputs only feature vectors rather than classification labels. Therefore, we first exploit the high-frequency component information of the image to guide the generation of adversarial examples. We then design a generative attack framework that constructs adversarial perturbations or patches by learning the distribution of the attack surrogate dataset, improving their attack success rates and transferability. Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset. We also tailor four defenses for pre-trained encoders, the results of which further demonstrate the attack ability of AdvEncoder. Our code is available at: https://***/CGCL-codes/AdvEncoder.
ISBN (digital): 9798400702174
ISBN (print): 9798350382143
As software engineering advances and the demand for code rises, code clones have become more prevalent. This phenomenon poses risks such as vulnerability propagation, underscoring the growing importance of code clone detection techniques. While numerous code clone detection methods have been proposed, they often fall short in real-world code environments: they either struggle to identify code clones effectively or demand substantial time and computational resources to handle complex clones. This paper introduces Toma, a code clone detection method that uses tokens and machine learning. Specifically, we extract token type sequences and employ six similarity calculation methods to generate feature vectors. These vectors are then fed into a trained machine learning model for classification. To evaluate the effectiveness and scalability of Toma, we conduct experiments on the widely used BigCloneBench dataset. Results show that our tool outperforms token-based code clone detectors and most tree-based clone detectors, demonstrating high effectiveness and significant time savings.
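The pipeline described above (token type sequences, then similarity scores, then a feature vector) can be sketched as follows. This is an illustration under stated assumptions, not Toma itself: it tokenizes Python source with the standard `tokenize` module rather than the Java code in BigCloneBench, and it uses three common similarity measures (Jaccard, Dice, and normalized Levenshtein) because the abstract does not name the six that Toma employs. All function names are hypothetical.

```python
import io
import token
import tokenize


def token_types(source):
    """Return the sequence of token type names for a Python snippet."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [token.tok_name[t.type] for t in tokens if t.type not in skip]


def bigrams(seq):
    """Set of adjacent pairs, so set-based measures see local order."""
    return set(zip(seq, seq[1:]))


def jaccard(a, b):
    a, b = bigrams(a), bigrams(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(a, b):
    a, b = bigrams(a), bigrams(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def levenshtein_similarity(a, b):
    """1 minus the edit distance divided by the longer sequence length."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute
        prev = cur
    longest = max(len(a), len(b))
    return 1.0 - prev[-1] / longest if longest else 1.0


def feature_vector(src_a, src_b):
    """Similarity scores for a code pair, ready for a classifier."""
    a, b = token_types(src_a), token_types(src_b)
    return [jaccard(a, b), dice(a, b), levenshtein_similarity(a, b)]


original = "def add(x, y):\n    return x + y\n"
renamed = "def plus(a, b):\n    return a + b\n"
unrelated = "for i in range(3):\n    print(i)\n"

clone_features = feature_vector(original, renamed)
other_features = feature_vector(original, unrelated)
```

Because identifiers collapse to the same token type, a clone that only renames variables yields an identical type sequence and therefore maximal similarity on every measure. In a full system the vectors would be passed to a trained classifier, which is omitted here.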
With the evolution of self-supervised learning, the pre-training paradigm has emerged as a predominant solution within the deep learning landscape. Model providers furnish pre-trained encoders designed to function as versatile feature extractors, enabling downstream users to harness the benefits of expansive models with minimal effort through fine-tuning. Nevertheless, recent works have exposed a vulnerability in pre-trained encoders, highlighting their susceptibility to downstream-agnostic adversarial examples (DAEs) meticulously crafted by attackers. The lingering question pertains to the feasibility of fortifying the robustness of downstream models against DAEs, particularly in scenarios where the pre-trained encoders are publicly accessible to the attackers. In this paper, we initially delve into existing defensive mechanisms against adversarial examples within the pre-training paradigm. Our findings reveal that the failure of current defenses stems from the domain shift between pre-training data and downstream tasks, as well as the sensitivity of encoder parameters. In response to these challenges, we propose Genetic Evolution-Nurtured Adversarial Fine-tuning (Gen-AF), a two-stage adversarial fine-tuning approach aimed at enhancing the robustness of downstream models. Gen-AF employs a genetic-directed dual-track adversarial fine-tuning strategy in its first stage to effectively inherit the pre-trained encoder. This involves optimizing the pre-trained encoder and classifier separately while incorporating genetic regularization to preserve the model’s topology. In the second stage, Gen-AF assesses the robust sensitivity of each layer and creates a dictionary, based on which the top-k robust redundant layers are selected with the remaining layers held fixed. Upon this foundation, we conduct evolutionary adaptability fine-tuning to further enhance the model’s generalizability. Our extensive experiments, conducted across ten self-supervised training methods and six
The Go language (Go/Golang) has attracted increasing attention from industry in recent years due to its strong concurrency support and ease of deployment. The language encourages developers to use channel-based concurrency, which simplifies the development of concurrent programs. Unfortunately, it also introduces new concurrency problems that differ from those caused by shared-memory concurrency. However, only a few works aim to detect such Go-specific concurrency issues. Even state-of-the-art testing tools miss critical concurrency bugs that require fine-grained and effective interleaving exploration. This paper presents GoPie, a novel testing approach for detecting Go concurrency bugs through primitive-constrained interleaving exploration. GoPie uses execution histories to identify new interleavings instead of relying on exhaustive exploration or random scheduling. To evaluate its performance, we applied GoPie to existing benchmarks and large-scale open-source projects. Results show that GoPie can effectively explore concurrent interleavings and detect significantly more bugs in the benchmark. Furthermore, it uncovered 11 unique, previously unknown concurrency bugs, 9 of which have been confirmed.
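To show why the order of channel operations matters, the Python sketch below models two goroutines as lists of channel operations and enumerates every interleaving. This is deliberately the exhaustive exploration that the abstract says GoPie avoids, and it is not GoPie's algorithm. The operation tuples and function names are hypothetical, and the model ignores blocking (it assumes a sufficiently buffered channel). The one Go fact it relies on is that sending on a closed channel causes a run-time panic.

```python
def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def panics(schedule):
    """Return True if the schedule sends on a channel after closing it."""
    closed = set()
    for op, chan in schedule:
        if op == "close":
            closed.add(chan)
        elif op == "send" and chan in closed:
            return True  # Go would panic: send on closed channel
    return False


# One goroutine sends twice; another closes the same channel.
producer = [("send", "ch"), ("send", "ch")]
closer = [("close", "ch")]

schedules = list(interleavings(producer, closer))
buggy = [s for s in schedules if panics(s)]
```

Only the schedule in which both sends precede the close is safe. The number of interleavings grows combinatorially with the number of operations, which is why selecting promising interleavings rather than enumerating all of them matters in practice.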