With the advancement of deep learning, object detectors (ODs) with various architectures have achieved significant success in complex scenarios like autonomous driving. Previous adversarial attacks against ODs have be...
Traditional unlearnable strategies have been proposed to prevent unauthorized users from training on 2D image data. With more 3D point cloud data containing sensitive information, unauthorized usage of this new ...
It has been recognized that one of the bottlenecks in UTXO-based blockchain systems is slow block validation - the process of validating a newly received block by a node before locally storing it and further broadcasting it. As a block contains multiple inputs, block validation mainly involves checking the inputs against the status data, also known as the Unspent Transaction Outputs (UTXO) set. Over time, the UTXO set grows larger and larger, so that most of it can only be stored on disk. This considerably slows down input checking and thus block validation, which can potentially compromise system security. To deal with this problem, we disassemble the function of input checking into three parts: existence validation (EV), unspent validation (UV), and script validation (SV). Based on this disassembly, we propose EBV, an efficient block validation mechanism that speeds up EV, UV, and SV individually. First, EBV changes the representation of the status data from a UTXO set to a bit-vector set, which drastically reduces its size. The smaller status data can be maintained entirely in memory, thereby accelerating UV and block validation. Second, EBV requires each transaction to carry proof data, which enables EV and SV without accessing the disks. Furthermore, we also cope with two challenges in the design of EBV, namely transaction inflation and fake positions. To evaluate the EBV mechanism, we implement a prototype on top of Bitcoin, the most widely known UTXO-based blockchain, and conduct extensive experiments to compare EBV and Bitcoin. The experimental results demonstrate that EBV reduces the memory requirement by 93.1% and the block validation time by up to 93.5%.
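To make the bit-vector idea concrete, the following is a minimal Python sketch of an in-memory status set in which each transaction id maps to a bit vector and bit i marks whether output i is unspent. The class and method names are illustrative only and are not taken from the EBV prototype; existence and script validation via per-transaction proof data are omitted.

```python
# Minimal sketch of a bit-vector status set for unspent validation (UV).
# Names (BitVectorSet, mark_spent, ...) are hypothetical, not EBV's code.

class BitVectorSet:
    """Maps a transaction id to a bit vector; bit i == 1 means output i is unspent."""

    def __init__(self):
        self._bits = {}  # txid -> int used as a bit vector

    def add_outputs(self, txid: str, n_outputs: int) -> None:
        # A newly confirmed transaction contributes n_outputs unspent outputs.
        self._bits[txid] = (1 << n_outputs) - 1

    def is_unspent(self, txid: str, index: int) -> bool:
        return bool(self._bits.get(txid, 0) >> index & 1)

    def mark_spent(self, txid: str, index: int) -> None:
        # Unspent validation: check the bit, then clear it so the same
        # output cannot be spent twice within or across blocks.
        if not self.is_unspent(txid, index):
            raise ValueError(f"double spend: {txid}:{index}")
        self._bits[txid] &= ~(1 << index)
        if self._bits[txid] == 0:
            del self._bits[txid]  # all outputs spent; drop the entry


# Usage: validate one input of a block entirely in memory.
status = BitVectorSet()
status.add_outputs("tx_a", 3)
status.mark_spent("tx_a", 1)
assert not status.is_unspent("tx_a", 1)
```

Because one integer per transaction replaces full output records, the in-memory footprint stays small, which is the property the mechanism relies on to keep UV off the disk.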
ISBN (digital): 9783981926385
ISBN (print): 9798350348606
Logic diagnosis is a key step in yield learning. Multiple-fault diagnosis is challenging for several reasons, including error masking, fault reinforcement, and the huge search space of possible fault combinations. This work proposes a two-phase method for multiple-fault diagnosis. The first phase efficiently reduces the number of potential fault candidates through machine learning. The second phase obtains the final diagnosis results by formulating the task as a combinatorial optimization problem that is then solved iteratively using binary evolutionary computation. Experiments show that our method outperforms two existing methods for multiple-fault diagnosis, and achieves better diagnosability (improved by $1.87\times$) and resolution (improved by $1.42\times$) compared with a state-of-the-art commercial diagnosis tool.
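To illustrate the second phase, the sketch below encodes a fault combination as a bit string and searches the space with a simple binary evolutionary loop (tournament selection, one-point crossover, bit-flip mutation). The fitness function is a hypothetical stand-in that rewards explaining observed failing patterns while penalizing large fault sets; it is not the scoring used in the paper.

```python
# Hedged sketch of binary evolutionary search over fault combinations.
import random

def evolve(num_candidates, fitness, pop_size=32, generations=100,
           mutation_rate=0.02):
    """Return the best bit string found; bit i == 1 selects candidate fault i."""
    pop = [[random.randint(0, 1) for _ in range(num_candidates)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Tournament selection, one-point crossover, and bit-flip mutation.
        parents = [max(random.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = random.randrange(1, num_candidates)
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                children.append([bit ^ (random.random() < mutation_rate)
                                 for bit in child])
        pop = children
        best = max(pop + [best], key=fitness)
    return best

# Hypothetical fitness: reward combinations that explain observed failing
# patterns and penalize large combinations (for better resolution).
observed = {3, 7, 11}
covers = {0: {3}, 1: {7, 11}, 2: {5}, 3: {3, 7}}  # candidate -> patterns it explains

def fitness(bits):
    explained = set()
    for i, b in enumerate(bits):
        if b:
            explained |= covers[i]
    return len(explained & observed) - 0.1 * sum(bits)

print(evolve(num_candidates=4, fitness=fitness))
```

In this toy setup the search should converge on selecting candidates 0 and 1, which together explain all observed failing patterns with the smallest fault set.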
ISBN (print): 9781939133458
Approximate nearest neighbor search (ANNS) has emerged as a crucial component of database and AI infrastructure. Ever-increasing vector datasets pose significant challenges in terms of performance, cost, and accuracy for ANNS services. No modern ANNS system can address these issues simultaneously. In this paper, we present FusionANNS, a high-throughput, low-latency, cost-efficient, and high-accuracy ANNS system for billion-scale datasets using SSDs and only one entry-level GPU. The key idea of FusionANNS lies in CPU/GPU collaborative filtering and re-ranking mechanisms, which significantly reduce I/O operations across CPUs, GPU, and SSDs to break through the I/O performance bottleneck. Specifically, we propose three novel designs: (1) multi-tiered indexing to avoid data swapping between CPUs and GPU, (2) heuristic re-ranking to eliminate unnecessary I/Os and computations while guaranteeing high accuracy, and (3) redundancy-aware I/O deduplication to further improve I/O efficiency. We implement FusionANNS and compare it with SPANN, the state-of-the-art SSD-based ANNS system, and RUMMY, the state-of-the-art GPU-accelerated in-memory ANNS system. Experimental results show that FusionANNS achieves (1) 9.4-13.1× higher queries per second (QPS) and 5.7-8.8× higher cost efficiency than SPANN, and (2) 2-4.9× higher QPS and 2.3-6.8× higher cost efficiency than RUMMY, while guaranteeing low latency and high accuracy.
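As a rough illustration of the heuristic re-ranking idea, the sketch below re-ranks approximately ordered candidates with exact vectors in small batches and stops early once the top-k result stabilizes, so the remaining exact-vector reads are skipped. The batch size, patience threshold, and the mocked fetch_exact I/O callback are assumptions for illustration and are not FusionANNS APIs.

```python
# Hedged sketch of batched re-ranking with early termination.
import heapq
import numpy as np

def rerank(query, candidates, fetch_exact, k=10, batch=32, patience=2):
    """candidates: vector ids sorted by approximate distance (best first).
    fetch_exact(ids) -> {id: exact_vector}, standing in for batched SSD reads."""
    topk = []                       # max-heap over distance via negation
    unchanged = 0
    for start in range(0, len(candidates), batch):
        ids = candidates[start:start + batch]
        vecs = fetch_exact(ids)     # one batched I/O per step
        before = sorted(i for _, i in topk)
        for i in ids:
            d = float(np.linalg.norm(query - vecs[i]))
            if len(topk) < k:
                heapq.heappush(topk, (-d, i))
            elif -d > topk[0][0]:
                heapq.heapreplace(topk, (-d, i))
        # Early termination: if the top-k has been stable for `patience`
        # consecutive batches, later (worse-approximated) candidates rarely
        # enter it, so the remaining exact-vector I/O is skipped.
        unchanged = unchanged + 1 if sorted(i for _, i in topk) == before else 0
        if unchanged >= patience:
            break
    return [i for _, i in sorted(topk, reverse=True)]

# Tiny example with synthetic vectors; fetch_exact reads from an in-memory dict.
rng = np.random.default_rng(0)
base = {i: rng.standard_normal(16) for i in range(1000)}
q = rng.standard_normal(16)
approx = sorted(base, key=lambda i: np.linalg.norm(q - base[i]) + rng.normal(0, 0.2))
print(rerank(q, approx, lambda ids: {i: base[i] for i in ids}, k=5))
```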
In this paper, we propose a Graph Inception Diffusion Networks (GIDN) model. This model generalizes graph diffusion in different feature spaces, and uses the inception module to avoid the large amount of computations c...
Evaluating and enhancing the general capabilities of large language models (LLMs) has been an important research topic. Graph is a common data structure in the real world, and understanding graph data is a crucial par...
Automatically generating webpage code from webpage designs can significantly reduce the workload of front-end developers, and recent Multimodal Large Language Models (MLLMs) have shown promising potential in this area...
ISBN (digital): 9798350383508
ISBN (print): 9798350383515
Feature-only partition of large graph data in distributed Graph Neural Network (GNN) training offers advantages over the commonly adopted graph structure partition, such as minimal graph preprocessing cost and elimination of cross-worker subgraph sampling burdens. Nonetheless, the performance bottleneck of GNN training with feature-only partitions still lies largely in the substantial communication overhead of cross-worker feature fetching. To reduce the communication overhead and expedite distributed training, we first investigate and answer two key questions on the convergence behavior of GNN models in feature-partition-based distributed GNN training: 1) As no worker holds a complete copy of each feature, can gradient exchange among workers compensate for the information loss due to incomplete local features? 2) If the answer to the first question is negative, is feature fetching in every training iteration of the GNN model necessary to ensure model convergence? Based on our theoretical findings on these questions, we derive an optimal communication plan that decides the frequency of feature fetching during the training process, taking into account the bandwidth levels among workers and striking a balance between model loss and training time. Extensive evaluation demonstrates results consistent with our theoretical analysis and the effectiveness of our proposed design.
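The communication-plan idea can be sketched as a training loop that refreshes its cached copy of the remote feature columns only on scheduled iterations and reuses the (possibly stale) cache in between, while gradients are still synchronized every step. The fixed fetch_interval, the synthetic data, and the linear model standing in for a GNN below are assumptions for illustration, not the plan derived in the paper or its system.

```python
# Hedged, self-contained sketch of periodic feature fetching under a
# feature-only partition.
import torch
import torch.nn as nn

torch.manual_seed(0)
N, D_LOCAL, D_REMOTE, C = 128, 8, 8, 3
local_feat = torch.randn(N, D_LOCAL)            # feature columns on this worker
remote_feat_source = torch.randn(N, D_REMOTE)   # columns held by other workers
labels = torch.randint(0, C, (N,))

def fetch_remote_features():
    # Stands in for a cross-worker fetch of remote feature slices.
    return remote_feat_source.clone()

model = nn.Linear(D_LOCAL + D_REMOTE, C)        # stand-in for a GNN layer
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

fetch_interval = 8                              # the "communication plan" knob
cached_remote = None
for step in range(64):
    if step % fetch_interval == 0:              # fetch only on scheduled steps
        cached_remote = fetch_remote_features()
    feats = torch.cat([local_feat, cached_remote], dim=1)
    loss = loss_fn(model(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()                                  # gradients still sync every step
```

Increasing fetch_interval trades communication volume against the staleness of remote features, which is exactly the loss-versus-training-time balance the derived plan is meant to optimize.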
ISBN (digital): 9798350341058
ISBN (print): 9798350341065
Over the past decade, various methods for detecting side-channel leakage have been proposed and proven to be effective against CPU side-channel attacks. These methods are valuable in helping developers identify and patch side-channel vulnerabilities. Nevertheless, recent research has revealed the feasibility of exploiting side-channel vulnerabilities to steal sensitive information from GPU applications, which are beyond the reach of previous side-channel detection methods. Therefore, in this paper, we conduct an in-depth examination of various GPU features and present Owl, a novel side-channel detection tool targeting CUDA applications on NVIDIA GPUs. Owl is designed to detect and locate side-channel leakage in various types of CUDA applications. To track the execution of CUDA applications, we design a hierarchical tracing scheme and extend the A-DCFG (Attributed Dynamic Control Flow Graph) to handle the massively parallel execution in CUDA, ensuring Owl's detection scalability. After the initial assessment and filtering, we conduct statistical tests on the differences between program traces to determine whether they are indeed caused by input variations, which in turn helps locate the side-channel leaks. We evaluate Owl's capability to detect side-channel leaks by testing it on Libgpucrypto, PyTorch, and nvJPEG, and verify that our solution effectively handles a large number of threads. Owl has successfully identified hundreds of leaks within these applications. To the best of our knowledge, we are the first to implement side-channel leakage detection for general CUDA applications.
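As a simplified illustration of the statistical-testing step, the sketch below compares per-node execution counts from traces collected under two input classes with Welch's t-test and flags nodes whose statistic exceeds a threshold. The choice of test, trace feature, and threshold are assumptions for illustration and are not necessarily what Owl uses.

```python
# Hedged sketch: flag trace positions whose behavior differs across input classes.
import numpy as np
from scipy import stats

def find_leaky_nodes(traces_a, traces_b, threshold=4.5):
    """traces_a, traces_b: arrays of shape (runs, nodes) holding per-node
    execution counts gathered under input class A and B respectively."""
    leaks = []
    for node in range(traces_a.shape[1]):
        t, _ = stats.ttest_ind(traces_a[:, node], traces_b[:, node],
                               equal_var=False)     # Welch's t-test
        if np.isfinite(t) and abs(t) > threshold:
            leaks.append((node, float(t)))
    return leaks

# Synthetic example: node 2 executes more often for class B inputs,
# mimicking an input-dependent branch in the traced kernel.
rng = np.random.default_rng(1)
a = rng.poisson(lam=[20, 5, 10, 8], size=(200, 4)).astype(float)
b = rng.poisson(lam=[20, 5, 14, 8], size=(200, 4)).astype(float)
print(find_leaky_nodes(a, b))
```

A node that survives this test is only a candidate leak site; attributing it to a specific secret-dependent branch or memory access still requires inspecting the corresponding region of the control-flow graph.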