Indexing is a key component of any search tool that must be optimized for searching documents. Among existing indexing techniques, inverted indexing is one of the most widely used at large scale across applications. Under this method, the index is built using a signature file, hash tree, or B-tree so that the required document can be retrieved efficiently. The B-tree is popular for its search efficiency, but its performance degrades as the data set grows. The wavelet tree has become a popular and versatile data structure in the last decade, used in domains such as sequences, indexing, compression, and grid points, with surprising results. This study proposes a parallel wavelet tree algorithm, hybridized with the Map-Reduce concept, to construct an index for textual search. The proposed algorithm reduces index construction time considerably. Experiments show that it offers a reasonable trade-off against existing indexing approaches. For large data sets, index construction time is reduced relative to existing state-of-the-art schemes. The results also show that the algorithm performs well as the data set scales, up to full utilization of the available cores; this is possible because multiple threads work in parallel. Our experiments demonstrated consistent performance with 2, 4, 8, and 12 cores, while the 16-core results show an increase in index construction time due to parallel overhead when the data set is not sufficiently large.
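For readers unfamiliar with the data structure named in this abstract, the following is a minimal sequential sketch of a wavelet tree over integer term IDs, supporting the rank query that index lookups typically build on. It is not the paper's parallel Map-Reduce construction; the class name `WaveletTree`, the use of plain prefix-count lists instead of compressed bitvectors, and the integer alphabet are all assumptions made for illustration.

```python
class WaveletTree:
    """Pointer-based wavelet tree over a sequence of integers (illustrative only)."""

    def __init__(self, seq, lo=None, hi=None):
        if lo is None:
            # Root call: the alphabet is the value range of the (non-empty) input.
            lo, hi = min(seq), max(seq)
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        self.prefix = None
        if lo == hi or not seq:
            return  # leaf, or an empty subtree
        mid = (lo + hi) // 2
        # prefix[i] = how many of the first i elements are routed to the left child.
        self.prefix = [0]
        for x in seq:
            self.prefix.append(self.prefix[-1] + (x <= mid))
        self.left = WaveletTree([x for x in seq if x <= mid], lo, mid)
        self.right = WaveletTree([x for x in seq if x > mid], mid + 1, hi)

    def rank(self, symbol, i):
        """Return the number of occurrences of `symbol` in seq[:i]."""
        if i == 0 or symbol < self.lo or symbol > self.hi:
            return 0
        if self.lo == self.hi:
            return i  # every element that reached this leaf equals `symbol`
        mid = (self.lo + self.hi) // 2
        went_left = self.prefix[i]
        if symbol <= mid:
            return self.left.rank(symbol, went_left)
        return self.right.rank(symbol, i - went_left)


# Hypothetical usage: term IDs of a small token stream.
terms = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
tree = WaveletTree(terms)
occurrences_of_5 = tree.rank(5, len(terms))
```

Each level of the tree depends only on how the sequence is partitioned at the level above, which is one reason the construction lends itself to the kind of parallelization the abstract describes.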
String indexes such as the suffix array (SA) and the closely related longest common prefix (LCP) array are fundamental objects in bioinformatics and have a wide variety of applications. Despite their importance in pra...
Approximate nearest neighbor (ANN) search on high-dimensional data is a fundamental operation in many applications. In this paper, we study massive queries of ANN (MQ-ANN) search, which deals with a large number of queries simultaneously. To improve throughput, we combine the parallel capacity of multi-core CPUs with the filtering power of state-of-the-art index methods, i.e., proximity graphs. However, no existing solution exploits proximity graphs to handle MQ-ANN in parallel, except one called query view, which simply assigns each query to a hardware thread and suffers from numerous cache misses. As a first attempt, we design efficient methods for MQ-ANN with proximity graphs and propose a novel scheduling mechanism called bridge view, which shares the same data access across multiple queries in order to reduce cache misses. Moreover, we extend our method to handle MQ-ANN on large-scale data sets (e.g., 10^8 points). Finally, we conduct extensive experiments on real data sets to demonstrate the advantages of our method. According to our experimental results, bridge view significantly outperforms query view in various settings. In particular, bridge view with 8 hardware threads outperforms even query view with 24 hardware threads.
In this work, we present a constant-round algorithm for the 2-ruling set problem in the Congested Clique model. As a direct consequence, we obtain a constant round algorithm in the MPC model with linear space-per-mach...
Innovations in powerful high-performance computing (HPC) architecture are enabling high-fidelity whole-core neutron transport simulations in reasonable time. Especially, the currently fashionable heterogeneous archite...
Mesh simplification is a fundamental problem in geometry processing. Since general simplification algorithms are difficult to parallelize, the main challenge is to process meshes of tens of millions of faces with fast...
In this paper, we propose a parallel strongly connected components (SCC) implementation that is efficient on a wide range of graphs. Our speedup comes from two novel techniques: vertical granularity control (VGC) and ...
ISBN: (print) 9781479956180
An edge switch is an operation on a network (graph) in which two edges are selected at random and one end vertex of each is swapped with that of the other. Usually, a sequence of these operations is performed to generate network perturbations that have the same degree sequence as the original network. Edge switch operations have important applications in graph theory and network analysis, such as generating random networks with a given degree sequence, modeling and analyzing dynamic networks (e.g., peer-to-peer networks), and studying various dynamic phenomena over a network (e.g., disease dynamics over a social contact network). The growth of real-world networks motivates the need for efficient parallel algorithms that perform a large sequence of edge switch operations. The dependencies among successive edge switch operations, together with the requirement of keeping the graph simple (i.e., no self-loops or parallel edges) as the edges are switched, pose significant challenges in designing a parallel algorithm. Addressing these challenges requires complex synchronization and communication among the processors. In this paper, we present a distributed memory parallel algorithm for switching edges in massive networks (networks with billions of edges) and achieve a speedup factor of 85 with 1024 processors. One of the steps in our edge switch algorithm requires the computation of multinomial random variables in parallel. The paper presents the first non-trivial parallel algorithm for this problem. The algorithm achieves a speedup of 925 using 1024 processors.
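The edge switch operation defined in this abstract can be sketched sequentially as follows. This is only an illustration of a single switch with the simplicity checks, not the paper's distributed-memory algorithm; the function names `norm`, `edge_switch`, and `degrees`, and the representation of the graph as an edge list plus an edge set, are assumptions.

```python
import random
from collections import Counter


def norm(u, v):
    """Store an undirected edge with its smaller endpoint first."""
    return (u, v) if u < v else (v, u)


def edge_switch(edge_list, edge_set, rng):
    """Try one edge switch in place; return True if applied, False if rejected."""
    i, j = rng.sample(range(len(edge_list)), 2)
    (a, b), (c, d) = edge_list[i], edge_list[j]
    if rng.random() < 0.5:
        c, d = d, c  # pick at random which end vertices are exchanged
    # Proposed replacement: (a, b), (c, d) -> (a, d), (c, b)
    if a == d or c == b:
        return False  # would create a self-loop
    new1, new2 = norm(a, d), norm(c, b)
    if new1 in edge_set or new2 in edge_set or new1 == new2:
        return False  # would create a parallel edge
    edge_set.discard(edge_list[i])
    edge_set.discard(edge_list[j])
    edge_list[i], edge_list[j] = new1, new2
    edge_set.update((new1, new2))
    return True


def degrees(edge_list):
    """Degree of every vertex that appears in the edge list."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return deg


# Hypothetical usage on a small simple graph.
rng = random.Random(0)
edges = [norm(u, v) for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]]
edge_set = set(edges)
before = degrees(edges)
for _ in range(500):
    edge_switch(edges, edge_set, rng)
```

Because each accepted switch removes one incident edge and adds one at each of the four vertices involved, the degree sequence is unchanged. The rejection checks read the current edge set, which hints at why successive switches are hard to run independently in parallel.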
The calculation of flow accumulation is one of the tasks in digital terrain analysis that is not easy to parallelize. The aim of this work was to develop new, faster ways to calculate flow accumulation and to achieve shorter execution times than popular software tools built for this purpose. We prepared six implementations of algorithms based on both top-down and bottom-up approaches and compared their performance using 118 different data sets (59 subcatchments and 59 full frames) of various sizes but the same area and resolution. Our results clearly show that the parallel top-down algorithm (without the use of OpenMP tasks) is the most suitable implementation for flow accumulation calculations among all those we tested. The mean and median execution times of this algorithm are the shortest in all cases studied. The implementation is characterized by high speedups. The execution times of the parallel top-down implementation are two orders of magnitude shorter than those of the Flow Accumulation tool from ArcGIS Desktop. This is important considering the performance of popular GIS platforms, where the same kind of operation takes hours on similar equipment.
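As a rough illustration of what a top-down flow accumulation pass can look like, the sketch below starts from cells that receive no inflow and pushes their accumulated values downstream. It is a sequential sketch under stated assumptions, not any of the six implementations compared in the abstract: the function name `flow_accumulation`, the encoding of flow direction as a `(d_row, d_col)` offset or `None`, and the convention that every cell counts itself are all hypothetical choices.

```python
from collections import deque


def flow_accumulation(direction):
    """direction[r][c] is the (d_row, d_col) offset of the downstream
    neighbour, or None for an outlet or pit. Returns the number of cells
    draining through each cell, the cell itself included."""
    rows, cols = len(direction), len(direction[0])

    def downstream(r, c):
        d = direction[r][c]
        if d is None:
            return None
        nr, nc = r + d[0], c + d[1]
        # Flow leaving the grid is treated as reaching an outlet.
        return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else None

    # Count how many neighbours drain into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c)
            if target is not None:
                inflow[target[0]][target[1]] += 1

    acc = [[1] * cols for _ in range(rows)]
    # Top-down: begin at cells with no inflow and move downstream.
    queue = deque((r, c) for r in range(rows) for c in range(cols) if inflow[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        target = downstream(r, c)
        if target is None:
            continue
        tr, tc = target
        acc[tr][tc] += acc[r][c]
        inflow[tr][tc] -= 1
        if inflow[tr][tc] == 0:
            queue.append(target)  # all upstream contributions have arrived
    return acc


# Hypothetical 2x2 grid: the top row flows down, the bottom-left cell flows
# right, and the bottom-right cell is the outlet.
grid = [[(1, 0), (1, 0)],
        [(0, 1), None]]
result = flow_accumulation(grid)
```

A cell is processed only after all of its upstream neighbours have contributed, so cells that are ready at the same time have no dependency on one another.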