In this paper, we present an Advanced Vector Extensions (AVX) accelerated method for a bit-parallel algorithm that realizes fast string search with the aim of maximizing stable search throughput. An advantage of our method is that it accelerates string search by regularizing both control flow and data structures. This regularization facilitates the exploitation of the latest vector instruction set to achieve efficient parallel search of multiple patterns of different lengths. We use AVX instructions to increase search throughput per CPU core and employ OpenMP directives to realize data-parallel search of strings. As a result, we found that our data structure doubled search throughput compared with a previous bit-parallel approach that used a data structure for patterns of the same length. We also found that our method achieved stable search throughput for arbitrary data if the pattern size is large but still small enough to fit into a word. Experimental results are provided to clarify the advantages and disadvantages of our method in comparison with Aho-Corasick based methods. We believe that our method is useful for large genome texts with many partial matches. (C) 2014 Elsevier Inc. All rights reserved.
ISBN: 9780769553191 (print)
Approximate string matching has been widely used in many areas, such as web searching and deoxyribonucleic acid (DNA) sequence matching. Approximate string matching allows differences between a string and a pattern caused by insertion, deletion, and substitution. Because approximate string matching is a data-intensive task, accelerating it has become crucial for processing big data. In this paper, we propose a hierarchical parallelism approach to accelerate the bit-parallel algorithm on NVIDIA GPUs. A data parallelism approach is used to accelerate the kernel of the bit-parallel algorithm, while a task parallelism approach is used to overlap data transfer with kernel computation. In addition, we propose using hashing to reduce memory usage, achieving a 98.4% memory reduction. The experimental results show that the bit-parallel algorithm on GPUs runs 7 to 11 times faster than the multithreaded CPU implementation. Compared to state-of-the-art approaches, the proposed approach achieves a 2.8 to 104.8 times improvement.
ISBN: 9781450392051 (print)
Semi-structured data, such as JSON, are fundamental to the Web and document data stores. Streaming analytics on semi-structured data combines parsing and query evaluation into one pass to avoid generating parse trees. Though promising, its conventional design requires parsing the data stream in detail, character by character, which limits the efficiency of streaming analytics. This work reveals a wide range of opportunities to fast-forward the streaming over certain data substructures that are irrelevant to the query evaluation. However, identifying these substructures may itself need detailed parsing. To resolve this dilemma, this work designs a highly bit-parallel solution that intensively utilizes bitwise and SIMD operations to identify the irrelevant substructures during the streaming. It includes a new streaming model, recursive-descent streaming, for easy adoption of fast-forward optimizations; a concept, structural intervals, for partitioning the data stream; and a group of bit-parallel algorithms implementing various fast-forward cases. The solution is implemented as a JSON streaming framework called JSONSki. It offers a set of APIs that can be invoked during the streaming to dynamically fast-forward over different cases of irrelevant substructures. Evaluation using real-world datasets and standard path queries shows that JSONSki can achieve significant speedups over state-of-the-art JSON processing tools while keeping a minimal memory footprint.
In this study, to substantially improve the runtimes of exact and approximate string matching algorithms, we propose a tribrid parallel method for bit-parallel algorithms such as the Shift-Or and Wu-Manber algorithms. Our underlying idea is to interpret bit-parallel algorithms as inclusive-scan operations, which allow these bit-parallel algorithms to run efficiently on a graphics processing unit (GPU). We achieve this speed-up because inclusive-scan operations not only eliminate duplicate searches between threads but also realize a GPU-friendly memory access pattern that maximizes memory read/write throughput. To realize our ideas, we first define two binary operators and then present a proof of the associativity of these operators, which is necessary for the parallelization of the inclusive-scan operations. Finally, we integrate the inclusive-scan scheme into a previous segmentation-based scheme to maximize search throughput, identifying the best tradeoff point between synchronization cost and duplicate work. In our experiments, we compared our proposed method with previous segmentation-based methods and indexing-based sequence aligners. For online string matching, our proposed method performed 6.7-16.7 times faster than previous methods, achieving a search throughput of up to 1.88 terabits per second (Tbps) on a GeForce GTX TITAN X GPU. We therefore conclude that our proposed method is quite effective for decreasing the runtimes of online string matching of short patterns.
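The scan interpretation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: the abstract does not give its two binary operators, so the `combine` operator below is an assumption derived from the textbook Shift-Or update `state = (state << 1) | mask[c]`. Each text character is treated as a pair (shift, mask), and composing two such updates yields another pair of the same shape, which is what makes the operator associative. The function names are hypothetical, and `itertools.accumulate` runs the scan sequentially here; a GPU version would replace it with a parallel scan.

```python
from itertools import accumulate


def build_masks(pattern):
    """Shift-Or masks: bit i of masks[c] is 0 exactly when pattern[i] == c."""
    ones = (1 << len(pattern)) - 1
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, ones) & ~(1 << i)
    return masks, ones


def shift_or_sequential(text, pattern):
    """Textbook Shift-Or: start indices of exact occurrences of pattern."""
    m = len(pattern)
    if m == 0:
        return []
    masks, ones = build_masks(pattern)
    state, hits = ones, []
    for j, ch in enumerate(text):
        # A 0 bit at position i means pattern[0..i] matches text ending at j.
        state = ((state << 1) | masks.get(ch, ones)) & ones
        if not state & (1 << (m - 1)):
            hits.append(j - m + 1)
    return hits


def shift_or_scan(text, pattern):
    """Same result, computed as an inclusive scan over (shift, mask) pairs."""
    m = len(pattern)
    if m == 0:
        return []
    masks, ones = build_masks(pattern)

    def combine(left, right):
        # Composition of x -> (x << s1) | m1 followed by x -> (x << s2) | m2.
        # Shifts are capped at m because the state is only m bits wide.
        s1, m1 = left
        s2, m2 = right
        return min(s1 + s2, m), ((m1 << s2) | m2) & ones

    elements = [(1, masks.get(ch, ones)) for ch in text]
    hits = []
    for j, (shift, mask) in enumerate(accumulate(elements, combine)):
        state = ((ones << shift) | mask) & ones  # apply prefix to initial state
        if not state & (1 << (m - 1)):
            hits.append(j - m + 1)
    return hits
```

Both functions return the same start positions; because `combine` is associative, the prefix results can be computed in any grouping, which is the property a parallel scan relies on.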
The problem of finding a constrained longest common subsequence (CLCS) for the sequences A and B with respect to the sequence P was introduced recently. Its goal is to find a longest subsequence C of A and B such that P is a subsequence of C. Most of the algorithms solving the CLCS problem are based on dynamic programming. Bit-parallelism is a technique that uses the individual bits of a machine word for concurrent computation. We propose the first bit-parallel algorithm for computing a CLCS and/or its length, which outperforms the other known algorithms in terms of speed.
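To make the idea of bit-parallelism concrete, here is a minimal Python sketch of a well-known bit-vector method for the length of the ordinary (unconstrained) longest common subsequence. It is not the CLCS algorithm proposed in the paper, whose details the abstract does not give; the function name is hypothetical, and Python integers stand in for machine words. One column of the dynamic programming table is packed into a single integer, so a whole column is updated with a few additions and bitwise operations.

```python
def lcs_length_bitparallel(a, b):
    """Length of the longest common subsequence of a and b.

    Bit i of `column` stays 1 until the DP value increases at row i;
    the LCS length is the number of 0 bits among the low len(a) bits.
    """
    m = len(a)
    if m == 0:
        return 0
    ones = (1 << m) - 1
    # match[c] has bit i set exactly when a[i] == c.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)
    column = ones
    for ch in b:
        hits = column & match.get(ch, 0)
        # The addition propagates a carry through runs of 1 bits,
        # which updates every row of the column in one step.
        column = ((column + hits) | (column & ~match.get(ch, 0))) & ones
    return m - bin(column).count("1")
```

For example, `lcs_length_bitparallel("abc", "abc")` evaluates to 3, and swapping the order of two characters, as in `("ab", "ba")`, gives 1.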
ISBN: 9780769557854 (print)
Approximate string matching has been widely used in many applications, including deoxyribonucleic acid (DNA) sequence searching, spell checking, text mining, and spam filters. The method is designed to find all locations of strings that approximately match a pattern within a given number of insertion, deletion, and substitution operations. Among the proposed algorithms, the bit-parallel algorithms are considered to be the best and most efficient. However, the traditional bit-parallel algorithms lack the ability to identify the start and end positions of a matched pattern. Furthermore, accelerating the bit-parallel algorithms has become a crucial issue for processing big data. In this paper, we propose two kinds of parallel location-aware algorithms, called data-segmented parallelism and high-degree parallelism, as means to accelerate approximate string matching using graphics processing units (GPUs). Experimental results show that high-degree parallelism on GPUs achieves significant improvements in system and kernel throughput compared to CPU counterparts. Compared to state-of-the-art approaches, the proposed high-degree parallelism achieves an 11 to 105 times improvement.
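For readers unfamiliar with the traditional bit-parallel approach this abstract refers to, the following Python sketch shows a textbook Wu-Manber style search with up to k errors (insertions, deletions, substitutions). It is a sequential CPU illustration, not the paper's location-aware GPU algorithms, and the function name is hypothetical. It also shows the limitation the abstract points out: the state only reveals where a match ends, not where it starts.

```python
def approx_search_ends(text, pattern, k):
    """End indices j such that some substring of text ending at j is
    within edit distance k of pattern (textbook Shift-And with errors)."""
    m = len(pattern)
    if m == 0 or k < 0:
        return []
    ones = (1 << m) - 1
    # match[c] has bit i set exactly when pattern[i] == c.
    match = {}
    for i, ch in enumerate(pattern):
        match[ch] = match.get(ch, 0) | (1 << i)
    # rows[d] bit i set: pattern[0..i] matches a text suffix with <= d errors.
    # Initially the first d pattern characters can be deleted at cost d.
    rows = [((1 << d) - 1) & ones for d in range(k + 1)]
    accept = 1 << (m - 1)
    ends = []
    for j, ch in enumerate(text):
        b = match.get(ch, 0)
        prev_old = rows[0]
        rows[0] = ((rows[0] << 1) | 1) & b
        for d in range(1, k + 1):
            old = rows[d]
            rows[d] = (
                (((old << 1) | 1) & b)                 # exact character match
                | prev_old                             # insertion in the text
                | (((prev_old | rows[d - 1]) << 1) | 1)  # substitution, deletion
            ) & ones
            prev_old = old
        if rows[k] & accept:
            ends.append(j)
    return ends
```

With `k = 0` this reduces to exact matching, so `approx_search_ends("abracadabra", "abra", 0)` returns the end positions `[3, 10]`. Recovering start positions needs extra bookkeeping, which is the gap the location-aware algorithms in the abstract address.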
ISBN: 9783642224553 (print)
Train-ground communication data generated during the operation of a high-speed railway train control system are key information that reflects the safety control logic and functions of high-speed railways. Because the devices of the train control system may contain defects, the communication data may contain errors (e.g., missing, redundant, or wrong messages). As a result, locating and diagnosing errors in these communication data is an important part of function testing in train-ground communication. These problems can be abstracted into an approximate string matching problem, which has seldom been studied in previous research. This paper extends the bit-parallel algorithm for the approximate regular expression matching problem and proposes an online method to locate and diagnose errors. Finally, a case study is presented, and the results indicate the effectiveness of this method.
We propose a new variant of the bit-parallel NFA of Baeza-Yates and Navarro (BPD) for approximate string matching [R. Baeza-Yates, G. Navarro, Faster approximate string matching, Algorithmica 23 (1999) 127-158]. BPD is one of the most practical approximate string matching algorithms under moderate pattern lengths and error levels [G. Myers, A fast bit-vector algorithm for approximate string matching based on dynamic programming, J. ACM 46 (3) (1999) 395-415; G. Navarro, M. Raffinot, Flexible Pattern Matching in Strings: Practical On-line Search Algorithms for Texts and Biological Sequences, Cambridge University Press, Cambridge, UK, 2002]. Given a length-m pattern and an error threshold k, the original BPD requires (m - k)(k + 2) bits of space to represent an NFA with (m - k)(k + 1) states. In this paper we remove redundancy from the original NFA representation. Our variant requires (m - k)(k + 1) bits of space, which is optimal in the sense that exactly one bit per state is used. The space efficiency is achieved by using an alternative, but equally or even more efficient, simulation algorithm for the bit-parallel NFA. We also present experimental results to compare our modified NFA against the original BPD and its main competitors. Our new variant is more efficient than the original BPD, and hence it takes over and extends the role of the original BPD as one of the most practical approximate string matching algorithms under moderate values of k and m. (c) 2008 Elsevier B.V. All rights reserved.
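The practical effect of the two space bounds is easiest to see with concrete numbers. The short Python sketch below only evaluates the two formulas quoted in the abstract; the helper name and the 64-bit word size are assumptions made for illustration and do not come from the paper.

```python
WORD_BITS = 64  # assumed machine word size, for illustration only


def bpd_bits(m, k):
    """Return (original BPD bits, variant bits) for pattern length m, threshold k."""
    if not 0 <= k < m:
        raise ValueError("expected 0 <= k < m")
    factor = m - k
    original = factor * (k + 2)  # (m - k)(k + 2) bits, as stated for the original BPD
    variant = factor * (k + 1)   # (m - k)(k + 1) bits, one bit per NFA state
    return original, variant


if __name__ == "__main__":
    for m, k in [(16, 3), (20, 4)]:
        original, variant = bpd_bits(m, k)
        print(f"m={m}, k={k}: original {original} bits "
              f"(fits in one word: {original <= WORD_BITS}), "
              f"variant {variant} bits "
              f"(fits in one word: {variant <= WORD_BITS})")
```

For m = 16 and k = 3, the original representation needs 13 x 5 = 65 bits, one more than an assumed 64-bit word, while the variant needs 13 x 4 = 52 bits and fits in a single word.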