Online exact string matching consists in locating all the occurrences of a pattern in a text where only the pattern can be preprocessed. Classical online exact string matching algorithms scan the text from start to en...
Some people implement "patterns" and "best practices" without analysing their efficiency on their projects. Consequently, our goal in this article is to convince software developers that it is worth t...
Cartesian tree matching is the problem of finding all substrings of a given text that have the same Cartesian tree as a given pattern. So far there is one linear-time solution for Cartesian tree matching, wh...
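To make the definition above concrete, here is a minimal sketch of a naive quadratic-time matcher. It is not the linear-time solution the abstract mentions. It assumes the usual convention that the root of a Cartesian tree is the leftmost minimum, so two sequences share a Cartesian tree exactly when their parent-distance arrays are equal. The function names `parent_distance` and `cartesian_tree_matches` are hypothetical and chosen only for illustration.

```python
def parent_distance(seq):
    """Return the parent-distance array of seq.

    Entry i is the distance from i back to the nearest earlier position
    holding a value <= seq[i], or 0 if no such position exists.
    """
    distances = []
    stack = []  # indices whose values are non-decreasing from bottom to top
    for i, value in enumerate(seq):
        # Discard earlier positions holding strictly larger values.
        while stack and seq[stack[-1]] > value:
            stack.pop()
        distances.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return distances


def cartesian_tree_matches(text, pattern):
    """Naively list every start index where the text window of length
    len(pattern) has the same parent-distance array as the pattern."""
    m = len(pattern)
    target = parent_distance(pattern)
    return [
        start
        for start in range(len(text) - m + 1)
        if parent_distance(text[start:start + m]) == target
    ]


if __name__ == "__main__":
    print(cartesian_tree_matches([5, 4, 6, 1, 7, 3], [2, 1, 3]))
```

Each window is re-encoded from scratch, so the sketch costs time proportional to the text length times the pattern length; it only illustrates which windows count as matches.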
In a medical information system based on the HL7 standard, it is necessary to establish an HL7 message validation mechanism to ensure the accuracy, legitimacy and completeness of the medical information exchange. The HL7 mes...
Lempel-Ziv 1977 (LZ77) parsing, matching statistics and the Burrows-Wheeler Transform (BWT) are all fundamental elements of stringology. In a series of recent papers, Policriti and Prezza (DCC 2016 and Algorithmica, C...
As the speed of today’s computer networks rapidly increases, affecting security performance in terms of detection speed, traditional security tools such as firewalls are insufficient to protect the net...
The string-searching task is a classic information-processing task. Users encounter solutions to this task either while working with text processors or browsers, employing standard built-in tools,...
ISBN: (print) 9783319589435; 9783319589428
We explore the benefits of parallelizing 7 state-of-the-art string matching algorithms. Using SIMD and multi-threading techniques we achieve a significant performance improvement of up to 43.3x over reference implementations and a speedup of up to 16.7x over the string matching program grep. We evaluate our implementations on the smart-corpora and the full human genome data set. We show scalability over the number of threads and the impact of pattern length.
Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interes...
A parallel corpus is very important for multilingual NLP tasks and for deep learning applications like statistical machine translation systems. The parallel corpus of the Hindi-English language pair available for news translation ...