A novel framework for parallel subgraph isomorphism on GPUs, named GPUSI, is proposed. It consists of GPU region exploration and GPU subgraph matching. GPUSI iteratively enumerates subgraph instances and solves subgraph isomorphism in a divide-and-conquer fashion. The framework relies entirely on graph traversal and avoids explicit join operations. To improve performance, a task-queue based method and the virtual-CSR graph structure balance the workload among warps, and a warp-centric programming model balances the workload among threads within a warp. A prototype of GPUSI is implemented, and comprehensive experiments with various graph isomorphism operations are carried out on diverse large graphs. The experiments demonstrate that GPUSI scales well and achieves a speed-up of 1.4–2.6 times over state-of-the-art solutions.
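To make the idea of join-free, traversal-based matching concrete, the following is a minimal single-threaded sketch and not the GPUSI implementation. It assumes a plain CSR layout (not the virtual-CSR structure from the abstract), undirected graphs, and non-induced matching; the names `build_csr`, `neighbors`, and `count_embeddings` are hypothetical.

```python
def build_csr(num_vertices, edges):
    """Build an undirected graph in CSR form: (row offsets, column indices)."""
    adjacency = [set() for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    offsets, columns = [0], []
    for nbrs in adjacency:
        columns.extend(sorted(nbrs))
        offsets.append(len(columns))
    return offsets, columns


def neighbors(csr, v):
    """Return the neighbor slice of vertex v from a CSR pair."""
    offsets, columns = csr
    return columns[offsets[v]:offsets[v + 1]]


def count_embeddings(query_csr, data_csr):
    """Count injective query-to-data vertex mappings that preserve query edges.

    Partial matches are extended one query vertex at a time by walking the
    data graph's adjacency lists, so no intermediate tables are joined.
    """
    num_query = len(query_csr[0]) - 1
    num_data = len(data_csr[0]) - 1
    mapping = [-1] * num_query
    used = [False] * num_data

    def extend(q):
        if q == num_query:
            return 1
        # Data vertices already assigned to query neighbors of q.
        anchors = [mapping[p] for p in neighbors(query_csr, q) if p < q]
        # Candidates come from one anchor's adjacency list when available.
        candidates = neighbors(data_csr, anchors[0]) if anchors else range(num_data)
        total = 0
        for d in candidates:
            if used[d]:
                continue
            if all(d in neighbors(data_csr, a) for a in anchors[1:]):
                mapping[q], used[d] = d, True
                total += extend(q + 1)
                mapping[q], used[d] = -1, False
        return total

    return extend(0)
```

For example, a triangle query against a data graph made of one triangle plus a pendant vertex yields six mappings, one per ordering of the triangle's vertices. A GPU version would distribute the candidate loops across warps and threads, which is where the load-balancing techniques named in the abstract come in.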
Open source software has become highly popular and is of great importance for most software engineering activities. To facilitate software organization and retrieval, tagging is used extensively in open source communities. However, finding the desired software through tags in communities such as Freecode and ohloh is still challenging because of tag insufficiency. In this paper, we propose TRG (tag recommendation based on semantic graph), a novel approach to discovering and enriching tags of open source software. First, we propose a semantic graph to model the semantic correlations between tags and the words in software descriptions. Then, based on the graph, we design an effective algorithm to recommend tags for software. Through comprehensive experiments on large-scale open source software datasets, comparing against several typical related works, we demonstrate the effectiveness and efficiency of our method in recommending proper tags.
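As a rough illustration of recommending tags from a word-tag graph, the sketch below uses simple co-occurrence counts as edge weights. This is an assumed stand-in, not the TRG algorithm, whose graph construction and ranking are not described in the abstract; `build_word_tag_graph` and `recommend_tags` are hypothetical names.

```python
from collections import Counter, defaultdict


def build_word_tag_graph(projects):
    """Build word -> Counter(tag -> co-occurrence count) from (description, tags) pairs."""
    graph = defaultdict(Counter)
    for description, tags in projects:
        for word in set(description.lower().split()):
            for tag in tags:
                graph[word][tag] += 1
    return graph


def recommend_tags(graph, description, top_k=3):
    """Score tags by summing normalized edge weights from the description's words."""
    scores = Counter()
    for word in set(description.lower().split()):
        edges = graph.get(word)
        if not edges:
            continue  # word never seen in a tagged description
        total = sum(edges.values())
        for tag, weight in edges.items():
            scores[tag] += weight / total
    # Sort by descending score, then alphabetically to break ties deterministically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_k]]
```

Given a few tagged descriptions, `recommend_tags(graph, "python web server", top_k=1)` returns the tag most strongly connected to those words. Normalizing each word's edges keeps very common words from dominating the score.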
The global data center market has grown explosively in recent years. As the market grows, so does the demand for fast provisioning of virtual resources to support elastic, manageable, and economical computing over the cloud. Fast provisioning of large-scale virtual machines (VMs), in particular, is critical to guaranteeing quality of service (QoS). In this paper, we systematically review the existing VM provisioning schemes and classify them into three main categories. We discuss the features and research status of each category, and introduce two recent solutions, VMThunder and VMThunder+, both of which can provision hundreds of VMs in seconds.
In this paper an effective memory-processor integrated architecture, called memory based processor array for artificial neural networks (MPAA), is proposed. The MPAA can be easily integrated into any host system via m...
Breadth-first search (BFS) is an important kernel for graph traversal and is used by many graph processing applications. Extensive studies have been devoted to boosting the performance of BFS. As the most effective solution, GPU acceleration achieves the state-of-the-art result of 3.3×10^9 traversed edges per second on an NVIDIA Tesla C2050 GPU. A novel vertex-frontier based GPU BFS algorithm is proposed, with three main features. First, to obtain a better workload balance on irregular graphs, a virtual-queue task decomposition and mapping strategy is introduced for vertex frontier expansion. Second, a global duplicate detection scheme is proposed to remove duplicate vertices from the vertex frontier effectively. Finally, a GPU-based bottom-up BFS approach is employed to process large frontiers. The experimental results demonstrate that the algorithm achieves a 10% improvement over the state-of-the-art method on diverse graphs. In particular, it exhibits a 2-3 times speedup over the state of the art on low-diameter and scale-free graphs on an NVIDIA Tesla K20c GPU, reaching a peak traversal rate of 11.2×10^9 edges/s.
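The interplay of frontier expansion, duplicate removal, and the bottom-up step can be sketched sequentially as follows. This is a CPU illustration under stated assumptions, not the paper's GPU algorithm: the graph is undirected and given as adjacency lists, duplicates are filtered with a per-vertex level array, and the switching rule (`switch_fraction`) is a hypothetical heuristic.

```python
def bfs_levels(adj, source, switch_fraction=0.25):
    """Return the BFS level of every vertex (-1 if unreachable).

    adj is an undirected graph as a list of neighbor lists. When the frontier
    holds more than switch_fraction of all vertices, the step runs bottom-up.
    """
    n = len(adj)
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        if len(frontier) > switch_fraction * n:
            # Bottom-up: each unvisited vertex looks for a parent in the frontier.
            in_frontier = [False] * n
            for v in frontier:
                in_frontier[v] = True
            for v in range(n):
                if level[v] == -1 and any(in_frontier[u] for u in adj[v]):
                    level[v] = depth
                    next_frontier.append(v)
        else:
            # Top-down: expand the frontier; the level check keeps each
            # vertex from entering the next frontier more than once.
            for u in frontier:
                for v in adj[u]:
                    if level[v] == -1:
                        level[v] = depth
                        next_frontier.append(v)
        frontier = next_frontier
    return level
```

Both directions produce the same levels; only the amount of work differs. Bottom-up steps tend to pay off when the frontier is large, since each unvisited vertex can stop scanning as soon as it finds one parent.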
Searching in large-scale unstructured peer-to-peer networks is challenging due to the lack of effective hint information to guide queries. In this paper, we propose POP, a parallel, cOllaborative and Probabilistic sea...
Nowadays, cloud providers of 'Infrastructure as a service' require datacenter networks to support virtualization and multi-tenancy at large scale, while it brings a grand challenge to datacenters. Traditional ...
The article proposes a selective compressed memory system (SCMS) focusing on a compressed cache architecture, in which only data blocks with good compression efficiency are compressed selectively and all compressed bl...
Determinism is very useful for debugging and testing multithreaded programs. Many deterministic approaches have been proposed, such as deterministic multithreading (DMT) and deterministic replay. However, these systems are either inefficient or target a single purpose, which makes them inflexible. In this paper, we propose an efficient and flexible deterministic framework for multithreaded programs. Our framework implements determinism in two steps: relaxed determinism and strong determinism. Relaxed determinism resolves data races efficiently by using a proper weak memory consistency model. After that, we implement strong determinism by resolving lock contentions deterministically. Since different approaches can be applied to these two steps independently, our framework provides a spectrum of deterministic choices, including a nondeterministic system (fast), a weak deterministic system (fast and conditionally deterministic), a DMT system, and a deterministic replay system. Our evaluation shows that the DMT configuration of this framework can even outperform a state-of-the-art DMT system.
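One simple way to make lock acquisition order deterministic is to grant the lock in fixed round-robin turns. The sketch below illustrates that general idea only; it is an assumption for illustration, not the mechanism used by the framework in the abstract, and `RoundRobinLock` and `run_rounds` are hypothetical names. It also assumes every thread acquires the lock the same number of times, since otherwise a waiting thread could block forever.

```python
import threading


class RoundRobinLock:
    """A lock granted to thread ids 0..n-1 in strict rotation."""

    def __init__(self, num_threads):
        self._cond = threading.Condition()
        self._turn = 0
        self._num_threads = num_threads

    def acquire(self, tid):
        self._cond.acquire()
        # Wait (releasing the underlying lock) until it is this thread's turn.
        while self._turn != tid:
            self._cond.wait()

    def release(self, tid):
        # Pass the turn to the next thread id and wake the waiters.
        self._turn = (tid + 1) % self._num_threads
        self._cond.notify_all()
        self._cond.release()


def run_rounds(num_threads, rounds):
    """Run worker threads that log their id inside the critical section."""
    lock = RoundRobinLock(num_threads)
    log = []

    def worker(tid):
        for _ in range(rounds):
            lock.acquire(tid)
            log.append(tid)
            lock.release(tid)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Every run of `run_rounds(3, 4)` logs the same sequence regardless of how the operating system schedules the threads. The cost is that a thread may wait for its turn even when the lock is free, which is the kind of overhead a practical deterministic system has to reduce.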
This paper proposes a Risk-Averse Just-In-Time (RAJIT) operation scheme for Ammonia-Hydrogen-based Micro-Grids (AHMGs) to boost electricity-hydrogen-ammonia coupling under uncertainties. First, an off-grid AHMG model ...