A novel framework for parallel subgraph isomorphism on GPUs is proposed, named GPUSI, which consists of GPU region exploration and GPU subgraph matching. The GPUSI iteratively enumerates subgraph instances and solves ...
详细信息
A novel framework for parallel subgraph isomorphism on GPUs is proposed, named GPUSI, which consists of GPU region exploration and GPU subgraph matching. The GPUSI iteratively enumerates subgraph instances and solves the subgraph isomorphism in a divide-and-conquer fashion. The framework completely relies on the graph traversal, and avoids the explicit join operation. Moreover, in order to improve its performance, a task-queue based method and the virtual-CSR graph structure are used to balance the workload among warps, and warp-centric programming model is used to balance the workload among threads in a warp. The prototype of GPUSI is implemented, and comprehensive experiments of various graph isomorphism operations are carried on diverse large graphs. The experiments clearly demonstrate that GPUSI has good scalability and can achieve speed-up of 1.4–2.6 compared to the state-of-the-art solutions.
This paper presents a novel algorithm to detect null pointer dereference errors. The algorithm utilizes both of the must and may alias information in a compact way to improve the precision of the detection. Using may ...
详细信息
GPGPUs are increasingly being used to as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world's fastest supercomputer...
详细信息
GPGPUs are increasingly being used to as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world's fastest supercomputer in the TOP500 list, built at NUDT (national University of Defense Technology) last year. However, despite their performance advantages, GPGPUs do not provide built-in fault-tolerant mechanisms to offer reliability guarantees required by many HPC applications. By analyzing the SIMT (single-instruction, multiple-thread) characteristics of programs running on GPGPUs, we have developed PartialRC, a new checkpoint-based compiler-directed partial recomputing method, for achieving efficient fault recovery by leveraging the phenomenal computing power of GPGPUs. In this paper, we introduce our PartialRC method that recovers from errors detected in a code region by partially re-computing the region, describe a checkpoint-based faulttolerance framework developed on PartialRC, and discuss an implementation on the CUDA platform. Validation using a range of representative CUDA programs on NVIDIA GPGPUs against FullRC (a traditional full-recomputing Checkpoint-Rollback-Restart fault recovery method for CPUs) shows that PartialRC reduces significantly the fault recovery overheads incurred by FullRC, by 73.5% when errors occur earlier during execution and 74.6% when errors occur later on average. In addition, PartialRC also reduces error detection overheads incurred by FullRC during fault recovery while incurring negligible performance overheads when no fault happens.
Searchable encryption allows cloud users to outsource the massive encrypted data to the remote cloud and to search over the data without revealing the sensitive information. Many schemes have been proposed to support ...
详细信息
Searchable encryption allows cloud users to outsource the massive encrypted data to the remote cloud and to search over the data without revealing the sensitive information. Many schemes have been proposed to support the keyword search in a public cloud. However,they have some potential limitations. First,most of the existing schemes only consider the scenario with the single data owner. Second,they need secure channels to guarantee the secure transmission of secret keys from the data owner to data users. Third,in some schemes,the data owner should be online to help data users when data users intend to perform the search,which is *** this paper,we propose a novel searchable scheme which supports the multi-owner keyword search without secure channels. More than that,our scheme is a non-interactive solution,in which all the users only need to communicate with the cloud server. Furthermore,the analysis proves that our scheme can guarantee the security even without secure channels. Unlike most existing public key encryption based searchable schemes,we evaluate the performance of our scheme,which shows that our scheme is practical.
In recent years, the problem of lake eutrophication has become increasingly severe. The monitoring and control of cyanobacteria in lakes are of great significance. The information obtained by existing monitoring metho...
详细信息
Test oracles are widely used to verify whether a system under test is running as desired. Since the correctness of real-time systems depends on the logical results of the computation and the time when results are prod...
详细信息
Barrier synchronization and reduction are global operations used frequently in large scale OpenMP programs. To improve OpenMP performance, we present two new directives BARRIER(0) and ALLREDUCTION to extend BARRIER an...
详细信息
Resources over Internet have such intrinsic characteristics as growth, autonomy and diversity, which have brought many challenges to the efficient sharing and comprehensive utilization of these resources. This paper p...
详细信息
Resources over Internet have such intrinsic characteristics as growth, autonomy and diversity, which have brought many challenges to the efficient sharing and comprehensive utilization of these resources. This paper presents a novel approach for the construction of the Internet-based Virtual Computing Environment (iVCE), whose sig- nificant mechanisms are on-demand aggregation and autonomic collaboration. The iVCE is built on the open infrastructure of the Internet and provides harmonious, transparent and integrated services for end-users and applications. The concept of iVCE is presented and its architectural framework is described by introducing three core concepts, i.e., autonomic element, virtual commonwealth and virtual executor. Then the connotations, functions and related key technologies of each components of the architecture are deeply analyzed with a case study, iVCE for Memory.
Predicting network latencies between Internet hosts can efficiently support large-scale Internet applications, e.g., file sharing service and the overlay construction. Several study use the Hyperbolic space to model t...
详细信息
Noncoding RNAs (ncRNAs) have important functional roles in biological processes and have become a central research interest in modern molecular biology. However, how to find ncRNA attracts much more attention since nc...
详细信息
暂无评论