With the evolution of self-supervised learning, the pre-training paradigm has emerged as a predominant solution within the deep learning landscape. Model providers furnish pre-trained encoders designed to function as versatile feature extractors, enabling downstream users to harness the benefits of expansive models with minimal effort through fine-tuning. Nevertheless, recent works have exposed a vulnerability in pre-trained encoders, highlighting their susceptibility to downstream-agnostic adversarial examples (DAEs) meticulously crafted by attackers. The lingering question pertains to the feasibility of fortifying the robustness of downstream models against DAEs, particularly in scenarios where the pre-trained encoders are publicly accessible to the attackers. In this paper, we initially delve into existing defensive mechanisms against adversarial examples within the pre-training paradigm. Our findings reveal that the failure of current defenses stems from the domain shift between pre-training data and downstream tasks, as well as the sensitivity of encoder parameters. In response to these challenges, we propose Genetic Evolution-Nurtured Adversarial Fine-tuning (Gen-AF), a two-stage adversarial fine-tuning approach aimed at enhancing the robustness of downstream models. Gen-AF employs a genetic-directed dual-track adversarial fine-tuning strategy in its first stage to effectively inherit the pre-trained encoder. This involves optimizing the pre-trained encoder and classifier separately while incorporating genetic regularization to preserve the model’s topology. In the second stage, Gen-AF assesses the robust sensitivity of each layer and creates a dictionary, based on which the top-k robust redundant layers are selected with the remaining layers held fixed. Upon this foundation, we conduct evolutionary adaptability fine-tuning to further enhance the model’s generalizability. Our extensive experiments, conducted across ten self-supervised training methods and six
ISBN (digital): 9798350350579
ISBN (print): 9798350350586
The Hypergraph Neural Network (HyperGNN) has emerged as a potent methodology for dissecting intricate multilateral connections among various entities. Current software/hardware solutions leverage a sequential execution model that relies on hyperedge and vertex indices to conduct standard matrix operations for HyperGNN inference. Yet they are impeded by the dual challenges of redundant computation and irregular memory access overheads. This is primarily due to the frequent and repetitive access and updating of feature vectors corresponding to the same hyperedges and vertices. To address these challenges, we propose the first redundancy-aware accelerator, RAHP, which enables high-performance execution of HyperGNN inference. Specifically, we integrate a redundancy-aware asynchronous execution approach into the accelerator design for HyperGNN to reduce redundant computations and off-chip memory accesses. To unveil opportunities for data reuse and unlock the parallelism that existing HyperGNN solutions fail to capture, the approach prioritizes vertices with the highest degree as roots, prefetches other vertices along the hypergraph structure to capture the common vertices among multiple hyperedges, and synchronizes the computations of hyperedges and vertices in real time. This facilitates the concurrent processing of relevant hyperedge and vertex computations of the common vertices along the hypergraph topology, reducing the overhead of redundant computation. Furthermore, by efficiently caching intermediate results of the common vertices, it curtails memory traffic and off-chip communications. To fully harness the performance potential of our proposed approach in the accelerator, RAHP incorporates a topology-driven data loading mechanism to minimize off-chip memory accesses on the fly. It is also endowed with an adaptive data synchronization scheme to mitigate the effects of conflicting updates of both hyperedges and vertices. Moreover, RAHP emplo
Federated learning (FL) has been demonstrated to be susceptible to backdoor attacks. However, existing academic studies on FL backdoor attacks rely on a high proportion of real clients with main task-related data, whi...
The Java Virtual Machine (JVM) is the fundamental software system that supports the interpretation and execution of Java bytecode. To support the surging performance demands of increasingly complex and large-scale Java programs, the Just-In-Time (JIT) compiler was proposed to perform sophisticated runtime optimization. However, this inevitably induces various bugs, which have become more pervasive over the decades and can often cause significant consequences. To facilitate the design of effective and efficient testing techniques to detect JIT compiler bugs, this study first performs a preliminary study aiming to understand the characteristics of JIT compiler bugs and the corresponding triggering test cases. Inspired by the empirical findings, we propose JOpFuzzer, a new JVM testing approach with a specific focus on JIT compiler bugs. The main novelty of JOpFuzzer is embodied in three aspects. First, besides generating new seeds, JOpFuzzer also searches for diverse configurations along the new dimension of optimization options. Second, JOpFuzzer learns the correlations between various code features and different optimization options to guide the process of seed mutation and option exploration. Third, it leverages the profile data, which can reveal the program execution information, to guide the fuzzing process. Such novelties enable JOpFuzzer to effectively and efficiently explore the two-dimensional input space. Extensive evaluation shows that JOpFuzzer outperforms the state-of-the-art approaches in terms of achieved code coverage. More importantly, it has detected 41 bugs in OpenJDK, and 25 of them have already been confirmed or fixed by the corresponding developers.
SMT solvers check the satisfiability of logic formulas over first-order theories and have been used in many critical applications, such as software verification, test case generation, and program synthesis. Bugs hidden in SMT solvers would severely mislead those applications and cause severe consequences. Therefore, ensuring the reliability and robustness of SMT solvers is of critical importance. Although many approaches have been proposed to test SMT solvers, it is still a challenge to discover bugs effectively. To tackle this challenge, we conduct an empirical study on the historical bug-triggering formulas in SMT solvers' bug tracking systems. We observe that the historical bug-triggering formulas contain valuable skeletons (i.e., core structures of formulas) as well as associated atomic formulas, which can significantly affect a formula's ability to trigger bugs. Therefore, we propose a novel approach that utilizes the skeletons extracted from the historical bug-triggering formulas and enumerates atomic formulas under the guidance of association rules derived from historical formulas. In this study, we realized our approach as a practical fuzzing tool, HistFuzz, and conducted extensive testing on the well-known SMT solvers Z3 and cvc5. To date, HistFuzz has found 111 confirmed new bugs in Z3 and cvc5, of which 108 have been fixed by the developers. More notably, out of the confirmed bugs, 23 are soundness bugs and invalid model bugs found in the solvers' default mode, which are essential for SMT solvers. In addition, our experiments also demonstrate that HistFuzz outperforms the state-of-the-art SMT solver fuzzers in terms of achieved code coverage and effectiveness.
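The skeleton idea in the abstract above can be illustrated with a small sketch. It is not HistFuzz's implementation: the tuple encoding of formulas, the `BOOLEAN_OPS` set, the `HOLE` marker, and all function names are assumptions made for illustration, and the enumeration here is exhaustive rather than guided by association rules.

```python
from itertools import product

# Assumed set of Boolean connectives that form the "skeleton" of a formula.
BOOLEAN_OPS = {"and", "or", "not", "=>"}
HOLE = "HOLE"  # hypothetical placeholder for a removed atomic formula


def extract_skeleton(formula):
    """Keep the Boolean structure and replace each atomic formula with a hole."""
    if isinstance(formula, tuple) and formula[0] in BOOLEAN_OPS:
        return (formula[0],) + tuple(extract_skeleton(arg) for arg in formula[1:])
    return HOLE


def count_holes(skeleton):
    """Count the holes in a skeleton."""
    if skeleton == HOLE:
        return 1
    return sum(count_holes(arg) for arg in skeleton[1:])


def fill(skeleton, atoms):
    """Fill holes left to right with atoms drawn from an iterator."""
    if skeleton == HOLE:
        return next(atoms)
    return (skeleton[0],) + tuple(fill(arg, atoms) for arg in skeleton[1:])


def enumerate_formulas(skeleton, atom_pool):
    """Yield every formula obtained by placing pool atoms into the holes."""
    for choice in product(atom_pool, repeat=count_holes(skeleton)):
        yield fill(skeleton, iter(choice))


# A made-up "historical" formula: (and (< x y) (not (= x 0)))
seed = ("and", ("<", "x", "y"), ("not", ("=", "x", "0")))
skeleton = extract_skeleton(seed)
atom_pool = [(">", "a", "1"), ("=", "b", "a")]
candidates = list(enumerate_formulas(skeleton, atom_pool))
```

A real fuzzer would rank or prune the atom choices (the abstract mentions association rules for this) instead of taking the full Cartesian product, which grows exponentially with the number of holes.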
Existing privacy-preserving approaches are generally designed to provide privacy guarantee for individual data in a database, which reduces the utility of the database for data analysis. In this paper, we propose a no...
The natural bijection between a proposed circuit design and its graph representation should allow any graph optimization algorithm to be deployed efficiently on many-core systems. However, this process suffers from exponentially growing overhead and a heavy memory footprint as signals propagate. To address this unique challenge, we systematically study simulation with millions of gates and identify that, although the processing complexity can grow exponentially from the signal inputs, the skewness of the computational graph persists. Thus, we present ZhouBi, a fast and scalable gate-level simulation framework that fully exploits the parallelism of many-core systems. ZhouBi makes three contributions: (I) a graph representation that colors gate-level netlists and identifies skew partitions based on the graph skewness; (II) a set of heuristics that picks between opportunistic and conservative algorithms to accelerate the simulation; (III) a system facility that supports selective mapping between simulation and many-core hardware, providing a tradeoff between the risk of concurrent simulation failure and performance gain. We have prototyped ZhouBi and evaluated it against practical baselines. ZhouBi achieves a 27.6× performance gain compared to the state-of-the-practice Veriwell without compromising correctness. Our framework supports large graphs, enabling scale-out gate-level simulations for chip design.
Large Language Models (LLMs) have shown remarkable progress in automated code generation. Yet, LLM-generated code may contain errors in API usage, class, data structure, or missing project-specific information. As muc...
The attention-based neural network attracts great interest due to its excellent accuracy enhancement. However, the attention mechanism requires huge computational efforts to process unnecessary calculations, significa...
Heterogeneous memory systems have become increasingly popular in recent years. Because heterogeneous storage media often show significantly different characteristics in terms of bandwidth, latency, capacity, and energy consumption, it is still challenging to best utilize them for cost-efficient and energy-efficient heterogeneous memory systems. In this paper, we propose a simulation framework for multi-tiered heterogeneous memory architectures based on the GEM5 and DRAMsim3 simulators. We design a heterogeneous memory controller to architect Non-Volatile Memory (NVM) as main memory, and architect both Dynamic Random Access Memory (DRAM) and High-Bandwidth Memory (HBM) as a hybrid cache of NVM. Specifically, HBM, DRAM, and NVM are managed in a single (flat) address space. However, we use an address remapping table to maintain the mappings between NVM pages and HBM/DRAM pages, and logically manage HBM/DRAM/NVM as a three-tiered hybrid memory system. We also design a hardware-supported hot page monitor based on the Majority Element Algorithm (MEA) to identify the hottest pages in the DRAM, and a dynamic threshold adjustment scheme for hot page migration to balance the memory bandwidth between DRAM and HBM. Our multi-tiered heterogeneous memory architecture can take advantage of the large capacity of NVM, the low latency of DRAM, and the high bandwidth of HBM concurrently. Experimental results show that our tiered memory architecture can improve application performance by an average of $2.5\times$ compared with an NVM-only architecture, and up to 57.4% compared with a DRAM-only architecture. Moreover, the performance gap between our HBM/DRAM/NVM architecture and an HBM-only architecture is less than 10%.
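To make the hot page monitor in the abstract above more concrete, here is a minimal software sketch of a majority-element style counter summary (the Misra-Gries generalization with `k` counters). It is an assumption about how such a monitor could work, not the paper's hardware design; the function names, the counter budget `k`, and the fixed `threshold` (standing in for the dynamic threshold the abstract mentions) are all hypothetical.

```python
def hot_page_candidates(accesses, k):
    """Track at most k candidate hot pages over a stream of page accesses.

    Any page accessed more than len(accesses) / (k + 1) times is guaranteed
    to remain in the returned summary; the stored counts are lower bounds.
    """
    counters = {}
    for page in accesses:
        if page in counters:
            counters[page] += 1
        elif len(counters) < k:
            counters[page] = 1
        else:
            # No free counter: decrement all tracked pages, evict those at zero.
            for tracked in list(counters):
                counters[tracked] -= 1
                if counters[tracked] == 0:
                    del counters[tracked]
    return counters


def pages_to_migrate(accesses, k, threshold):
    """Return tracked pages whose counter reaches the (assumed fixed) threshold."""
    counters = hot_page_candidates(accesses, k)
    return sorted(page for page, count in counters.items() if count >= threshold)


# Made-up access trace: page 1 dominates, pages 2-5 are touched once each.
trace = [1, 1, 2, 1, 3, 1, 4, 1, 5, 1]
summary = hot_page_candidates(trace, k=2)
hot = pages_to_migrate(trace, k=2, threshold=3)
```

The appeal of this family of algorithms for hardware is the fixed, small number of counters: memory cost depends on `k`, not on the number of distinct pages seen.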