Search results - Inner Mongolia University Library

A bit-parallel algorithm for searching multiple patterns with various lengths

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2015, vol. 76, pp. 49-57

Authors: Kusudo, Ko; Ino, Fumihiko; Hagihara, Kenichi. Affiliations: Osaka Univ, Grad Sch Informat Sci & Technol, Suita, Osaka 5650871, Japan; Fujitsu Ltd, Minato Ku, Tokyo 1057123, Japan

In this paper, we present an Advanced Vector Extensions (AVX) accelerated method for a bit-parallel algorithm that realizes fast string search for maximizing stable search throughput. An advantage of our method is that it accelerates string search by regularizing both control flow and data structures. This regularization facilitates the exploitation of the latest vector instruction set to achieve efficient parallel search of multiple patterns of different lengths. We use AVX instructions to increase search throughput per CPU core and employ OpenMP directives to realize data-parallel search of strings. As a result, we found that our data structure doubled search throughput as compared with a previous bit-parallel approach that used a data structure for patterns of the same length. We also found that our method achieved stable search throughput for arbitrary data if the pattern size is large, but small enough to fit into a word. Some experimental results are provided to understand the advantage and disadvantage of our method with a comparison to Aho-Corasick based methods. We believe that our method is useful for large genome texts with many partial matches. (C) 2014 Elsevier Inc. All rights reserved.

Keywords: String search bit-parallel algorithm Acceleration AVX
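
As an illustration of the idea in the record above (several patterns of different lengths searched through one bit-vector), here is a minimal sketch of a packed multi-pattern Shift-And search in plain Python. It is not the authors' AVX data structure, which the abstract does not describe; the names `build_tables` and `multi_search` are hypothetical, patterns are assumed non-empty, and Python integers stand in for machine words.

```python
def build_tables(patterns):
    """Pack all patterns side by side into one bit-vector layout."""
    masks = {}   # character -> bits of every pattern position holding it
    init = 0     # bit set at the first position of each pattern
    final = 0    # bit set at the last position of each pattern
    owner = {}   # last-position bit index -> pattern index
    offset = 0
    for idx, pat in enumerate(patterns):
        init |= 1 << offset
        for j, ch in enumerate(pat):
            masks[ch] = masks.get(ch, 0) | (1 << (offset + j))
        last = offset + len(pat) - 1
        final |= 1 << last
        owner[last] = idx
        offset += len(pat)
    return masks, init, final, owner


def multi_search(text, patterns):
    """Return (end_position, pattern) for every occurrence in text."""
    masks, init, final, owner = build_tables(patterns)
    state, hits = 0, []
    for pos, ch in enumerate(text):
        # One shift, one OR and one AND advance every pattern at once.
        # A bit shifted out of one pattern lands on the next pattern's
        # first position, which init sets anyway, so it is harmless.
        state = ((state << 1) | init) & masks.get(ch, 0)
        found = state & final
        while found:
            low = found & -found  # isolate the lowest set bit
            hits.append((pos, patterns[owner[low.bit_length() - 1]]))
            found ^= low
    return hits
```

The control flow is the same for every text character regardless of how many patterns partially match, which is the kind of regularity the abstract credits for stable throughput.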

Hierarchical Parallelism of Bit-Parallel Algorithm for Approximate String Matching on GPUs

Symposium on Computer Applications and Communications (SCAC)

Authors: Lin, Cheng-Hung; Wang, Guan-Hong; Huang, Chun-Cheng. Affiliation: Natl Taiwan Normal Univ, Dept Elect Engn, Taipei, Taiwan

ISBN: (print) 9780769553191

Approximate string matching has been widely used in many areas, such as web searching and deoxyribonucleic acid (DNA) sequence matching. Approximate string matching allows differences between a string and a pattern caused by insertion, deletion, and substitution. Because approximate string matching is a data-intensive task, accelerating it has become crucial for processing big data. In this paper, we propose a hierarchical parallelism approach to accelerate the bit-parallel algorithm on NVIDIA GPUs. A data parallelism approach is used to accelerate the kernel of the bit-parallel algorithm, while a task parallelism approach is used to overlap data transfer with kernel computation. In addition, we propose to use hashing to reduce memory usage and achieve a 98.4% memory reduction. The experimental results show that the bit-parallel algorithm on GPUs runs 7 to 11 times faster than the multithreaded CPU implementation. Compared to the state-of-the-art approaches, the proposed approach achieves a 2.8- to 104.8-fold improvement.

Keywords: approximate string matching bit-parallel algorithm graphic processing units
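
The abstract above does not say which bit-parallel algorithm is accelerated, so the following is only a representative sketch: a sequential, Wu-Manber-style Shift-And search that tolerates up to k insertions, deletions, and substitutions. The function name `approx_search` is hypothetical, the pattern is assumed non-empty, and nothing here reflects the GPU kernel or the hashing scheme of the paper.

```python
def approx_search(text, pattern, k):
    """Return end positions where pattern matches with at most k edits."""
    m = len(pattern)
    full = (1 << m) - 1
    accept = 1 << (m - 1)
    masks = {}
    for j, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << j)

    # rows[d] has bit j set when pattern[:j+1] matches a suffix of the
    # text read so far with at most d errors. Before any text is read,
    # the first d pattern characters can be covered by d deletions.
    rows = [(1 << d) - 1 for d in range(k + 1)]
    ends = []
    for pos, ch in enumerate(text):
        b = masks.get(ch, 0)
        new = [((rows[0] << 1) | 1) & b]
        for d in range(1, k + 1):
            new.append((
                (((rows[d] << 1) | 1) & b)      # match
                | rows[d - 1]                   # extra text character
                | ((rows[d - 1] << 1) | 1)      # substitution
                | ((new[d - 1] << 1) | 1)       # skipped pattern character
            ) & full)
        rows = new
        if rows[k] & accept:
            ends.append(pos)
    return ends
```

For example, `approx_search("surgery", "survey", 2)` reports the end positions 4, 5, and 6, while no position matches with k = 1.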

JSONSki: Streaming Semi-structured Data with Bit-Parallel Fast-Forwarding

27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)

Authors: Jiang, Lin; Zhao, Zhijia. Affiliation: Univ Calif Riverside, Riverside, CA 92521, USA

ISBN: (print) 9781450392051

Semi-structured data, such as JSON, are fundamental to the Web and document data stores. Streaming analytics on semi-structured data combines parsing and query evaluation into one pass to avoid generating parse trees. Though promising, its conventional design requires parsing the data stream in detail, character by character, which limits the efficiency of streaming analytics. This work reveals a wide range of opportunities to fast-forward the streaming over certain data substructures irrelevant to the query evaluation. However, identifying these substructures itself may need detailed parsing. To resolve this dilemma, this work designs a highly bit-parallel solution that intensively utilizes bitwise and SIMD operations to identify the irrelevant substructures during the streaming. It includes a new streaming model (recursive-descent streaming) for easy adoption of fast-forward optimizations, a concept (structural intervals) for partitioning the data stream, and a group of bit-parallel algorithms implementing various fast-forward cases. The solution is implemented as a JSON streaming framework called JSONSki. It offers a set of APIs that can be invoked during the streaming to dynamically fast-forward over different cases of irrelevant substructures. Evaluation using real-world datasets and standard path queries shows that JSONSki can achieve significant speedups over the state-of-the-art JSON processing tools while keeping a minimal memory footprint.

Keywords: JSON Parser Semi-structured Data SIMD bit-parallel algorithm
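
To give a feel for the kind of bitwise operation the abstract above alludes to, here is a small sketch of one well-known trick: building a bitmap of quote characters and taking its prefix XOR to mark which positions lie inside string literals, so that brackets inside strings can be ignored. This is not JSONSki's API or its actual algorithms; the function names are hypothetical, escaped quotes are assumed absent, and a Python integer stands in for the SIMD registers.

```python
def char_bitmap(text, chars):
    """Bit i is set when text[i] is one of chars."""
    bits = 0
    for i, ch in enumerate(text):
        if ch in chars:
            bits |= 1 << i
    return bits


def prefix_xor(bits, width):
    """Bit i of the result is the XOR of bits 0..i of the input."""
    shift = 1
    while shift < width:
        bits ^= bits << shift
        shift <<= 1
    return bits & ((1 << width) - 1)


def structural_brackets(text):
    """Positions of { } [ ] that are not inside a string literal."""
    n = len(text)
    quotes = char_bitmap(text, '"')
    # Set from each opening quote up to (not including) its closing quote.
    in_string = prefix_xor(quotes, n)
    brackets = char_bitmap(text, "{}[]")
    structural = brackets & ~in_string
    return [i for i in range(n) if (structural >> i) & 1]
```

With such a bitmap, a streaming parser could in principle jump to the next structural bracket instead of inspecting every character, which is the general spirit of fast-forwarding.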

Parallelizing Exact and Approximate String Matching via Inclusive Scan on a GPU

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, vol. 28, no. 7, pp. 1989-2002

Authors: Mitani, Yasuaki; Ino, Fumihiko; Hagihara, Kenichi. Affiliations: DWANGO Corp Ltd, Dev Head Off, Chuo Ku, 4-12-15 Ginza, Tokyo 1040061, Japan; Osaka Univ, Grad Sch Informat Sci & Technol, 1-5 Yamadaoka, Suita, Osaka 5650871, Japan

In this study, to substantially improve the runtimes of exact and approximate string matching algorithms, we propose a tribrid parallel method for bit-parallel algorithms such as the Shift-Or and Wu-Manber algorithms. Our underlying idea is to interpret bit-parallel algorithms as inclusive-scan operations, which allow these bit-parallel algorithms to run efficiently on a graphics processing unit (GPU); we achieve this speed-up here because inclusive-scan operations not only eliminate duplicate searches between threads but also realize a GPU-friendly memory access pattern that maximizes memory read/write throughput. To realize our ideas, we first define two binary operators and then present a proof regarding the associativity of these operators, which is necessary for the parallelization of the inclusive-scan operations. Finally, we integrate the inclusive-scan scheme into a previous segmentation-based scheme to maximize search throughput, identifying the best tradeoff point between synchronization cost and duplicate work. Through our experiments, we compared our proposed method with previous segmentation-based methods and indexing-based sequence aligners. For online string matching, our proposed method performed 6.7-16.7 times faster than previous methods, achieving a search throughput of up to 1.88 terabits per second (Tbps) on a GeForce GTX TITAN X GPU. We therefore conclude that our proposed method is quite effective for decreasing the runtimes of online string matching of short patterns.

Keywords: String matching bit-parallel algorithm inclusive scan shift-or algorithm Wu-Manber algorithm GPU
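
The abstract above does not spell out the two binary operators the authors define, so the following is a hypothetical construction that only illustrates why a bit-parallel recurrence can be phrased as an inclusive scan. It uses the Shift-And form of exact matching: each text character becomes a small transformation of the state, and composing transformations is associative, so prefix compositions could be computed by any parallel scan. Here `itertools.accumulate` plays the role of the scan sequentially; the names `element`, `combine`, and `scan_search` are assumptions, and the pattern is assumed non-empty.

```python
from itertools import accumulate


def build_masks(pattern):
    masks = {}
    for j, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << j)
    return masks


def element(ch, masks):
    """Encode the step D -> ((D << 1) | 1) & B[ch] as a triple (s, M, C)
    that stands for the map D -> ((D << s) & M) | C."""
    b = masks.get(ch, 0)
    return (1, b, b & 1)


def combine(a, b):
    """Compose two maps: apply a first, then b. This is associative."""
    s_a, m_a, c_a = a
    s_b, m_b, c_b = b
    return (s_a + s_b,
            (m_a << s_b) & m_b,
            ((c_a << s_b) & m_b) | c_b)


def scan_search(text, pattern):
    """End positions of exact matches, via an inclusive scan."""
    masks = build_masks(pattern)
    accept = 1 << (len(pattern) - 1)
    prefixes = accumulate((element(ch, masks) for ch in text), combine)
    # Applying a prefix map to the initial state 0 leaves just its C part.
    return [i for i, (_, _, c) in enumerate(prefixes) if c & accept]
```

Because `combine` is associative, the text could be cut into chunks, each chunk reduced independently, and the partial results merged afterwards, which is the property a parallel scan relies on.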

Bit-Parallel Algorithm for the Constrained Longest Common Subsequence Problem

FUNDAMENTA INFORMATICAE, 2010, vol. 99, no. 4, pp. 409-433

Authors: Deorowicz, Sebastian. Affiliation: Silesian Tech Univ, Inst Informat, PL-44100 Gliwice, Poland

The problem of finding a constrained longest common subsequence (CLCS) for the sequences A and B with respect to the sequence P was introduced recently. Its goal is to find a longest subsequence C of A and B such that P is a subsequence of C. Most of the algorithms solving the CLCS problem are based on dynamic programming. Bit-parallelism is a technique of using single bits in a machine word for concurrent computation. We propose the first bit-parallel algorithm computing a CLCS and/or its length which outperforms the other known algorithms in terms of speed.

Keywords: constrained longest common subsequence longest common subsequence dynamic programming bit-parallel algorithm
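
To make "using single bits in a machine word for concurrent computation" concrete, here is a minimal sketch of a classic bit-parallel computation of the plain (unconstrained) LCS length, where one addition and a few bitwise operations update a whole dynamic-programming column. It is not the CLCS algorithm of the record above, whose details the abstract does not give; the function name `lcs_length` is hypothetical, and a Python integer stands in for one or more machine words.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of a and b."""
    m = len(a)
    full = (1 << m) - 1
    match = {}  # character -> bitmap of its positions in a
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    # v encodes one column of the DP table: a zero bit at position i
    # marks a row where the LCS length grows by one.
    v = full
    for ch in b:
        u = v & match.get(ch, 0)
        v = ((v + u) | (v - u)) & full
    return m - bin(v).count("1")
```

Each character of `b` costs a constant number of word operations per machine word of `a`, instead of one table cell per character pair as in the textbook dynamic program.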

High-Performance Parallel Location-Aware Algorithms for Approximate String Matching on GPUs

21st IEEE International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Lin, Cheng-Hung; Huang, Chun-Cheng. Affiliation: Natl Taiwan Normal Univ, Dept Elect Engn, Taipei, Taiwan

ISBN: (print) 9780769557854

Approximate string matching has been widely used in many applications, including deoxyribonucleic acid sequence searching, spell checking, text mining, and spam filters. The method is designed to find all locations of strings that approximately match a pattern in accordance with the number of insertion, deletion, and substitution operations. Among the proposed algorithms, bit-parallel algorithms are considered the best and most efficient. However, traditional bit-parallel algorithms lack the ability to identify the start and end positions of a matched pattern. Furthermore, acceleration of bit-parallel algorithms has become a crucial issue for processing big data. In this paper, we propose two kinds of parallel location-aware algorithms, called data-segmented parallelism and high-degree parallelism, as means to accelerate approximate string matching using graphics processing units. Experimental results show that high-degree parallelism on GPUs achieves significant improvement in system and kernel throughputs compared to CPU counterparts. Compared to state-of-the-art approaches, the proposed high-degree parallelism achieves an 11- to 105-fold improvement.

Keywords: approximate string matching bit-parallel algorithm graphic processing units Levenshtein distance nondeterministic finite automaton parallel algorithm

An Error Location and Diagnosis Method for Communication Test of High-Speed Railway Train Control System Based on String Pattern Matching

International Conference on Computer Science and Education (CSE 2011)

Authors: Huang, Liu; Dong, Wei; Ji, Yindong; Sun, Xinya. Affiliation: Tsinghua Univ, Dept Automat, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China

ISBN: (print) 9783642224553

Train-ground communication data generated during the operation of a high-speed railway train control system are key information that reflects the safety control logic and function of high-speed railways. Because the devices of a train control system may be defective, the communication data may contain some errors (e.g., missing, redundant, or wrong messages). As a result, error location and diagnosis of these communication data is an important part of function testing in train-ground communication. These problems can be abstracted into an approximate string matching problem which has seldom been studied in previous research. This paper extends the bit-parallel algorithm for the approximate regular expression matching problem and proposes an online method to locate and diagnose errors. Finally, a case study is presented and the results indicate the effectiveness of this method.

Keywords: train-ground communication test error location and diagnosis approximate regular expression matching bit-parallel algorithm

An Error Location and Diagnosis Method for Communication Test of High-Speed Railway Train Control System Based on String Pattern Matching

Advances in Information Technology and Education

Authors: Liu Huang; Wei Dong; Yindong Ji; Xinya Sun. Affiliations: Tsinghua National Laboratory for Information Science and Technology; Department of Automation, Tsinghua University; IEEE

Train-ground communication data generated during the operation of a high-speed railway train control system are key information that reflects the safety control logic and function of high-speed railways. Because the devices of a train control system may be defective, the communication data may contain some errors (e.g., missing, redundant, or wrong messages). As a result, error location and diagnosis of these communication data is an important part of function testing in train-ground communication. These problems can be abstracted into an approximate string matching problem which has seldom been studied in previous research. This paper extends the bit-parallel algorithm for the approximate regular expression matching problem and proposes an online method to locate and diagnose errors. Finally, a case study is presented and the results indicate the effectiveness of this method.

Keywords: train-ground communication test error location and diagnosis approximate regular expression matching bit-parallel algorithm

Improving the bit-parallel NFA of Baeza-Yates and Navarro for approximate string matching

INFORMATION PROCESSING LETTERS, 2008, vol. 108, no. 5, pp. 313-319

Authors: Hyyro, Heikki. Affiliation: Univ Tampere, Dept Comp Sci, FIN-33101 Tampere, Finland

We propose a new variant of the bit-parallel NFA of Baeza-Yates and Navarro (BPD) for approximate string matching [R. Baeza-Yates, G. Navarro, Faster approximate string matching, Algorithmica 23 (1999) 127-158]. BPD is one of the most practical approximate string matching algorithms under moderate pattern lengths and error levels [G. Myers, A fast bit-vector algorithm for approximate string matching based on dynamic programming, J. ACM 46 (3) (1999) 395-415; G. Navarro, M. Raffinot, Flexible Pattern Matching in Strings: Practical On-line Search Algorithms for Texts and Biological Sequences, Cambridge University Press, Cambridge, UK, 2002]. Given a length-m pattern and an error threshold k, the original BPD requires (m - k)(k + 2) bits of space to represent an NFA with (m - k)(k + 1) states. In this paper we remove redundancy from the original NFA representation. Our variant requires (m - k)(k + 1) bits of space, which is optimal in the sense that exactly one bit per state is used. The space efficiency is achieved by using an alternative, but equally or even more efficient, simulation algorithm for the bit-parallel NFA. We also present experimental results to compare our modified NFA against the original BPD and its main competitors. Our new variant is more efficient than the original BPD, and it hence takes over/extends the role of the original BPD as one of the most practical approximate string matching algorithms under moderate values of k and m. (c) 2008 Elsevier B.V. All rights reserved.

Keywords: algorithms Edit distance Approximate string matching bit-parallel algorithm
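
The two space formulas quoted in the abstract above can be compared with a few lines of arithmetic. The sketch below assumes, purely for illustration, that the whole NFA should fit into a single 64-bit word (the abstract itself makes no such restriction); the function names are hypothetical.

```python
def original_bpd_bits(m, k):
    """Bits used by the original BPD layout: (m - k)(k + 2)."""
    return (m - k) * (k + 2)


def variant_bits(m, k):
    """Bits used by the proposed variant: (m - k)(k + 1), one per state."""
    return (m - k) * (k + 1)


def longest_pattern(k, bits_per_column, word=64):
    """Largest m such that (m - k) * bits_per_column fits into one word."""
    return k + word // bits_per_column


if __name__ == "__main__":
    m, k = 16, 3
    print(original_bpd_bits(m, k))    # 13 * 5 = 65 bits, one too many for 64
    print(variant_bits(m, k))         # 13 * 4 = 52 bits, fits in 64
    print(longest_pattern(k, k + 2))  # original layout: m up to 15
    print(longest_pattern(k, k + 1))  # variant layout: m up to 19
```

Under this single-word assumption and k = 3, dropping the extra bit per column raises the longest pattern that fits from 15 to 19 characters.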
