Search Results - Inner Mongolia University Library

Enhanced Indexing and Querying of Trajectories in Road Networks via String Algorithms

ACM TRANSACTIONS ON SPATIAL ALGORITHMS AND SYSTEMS, 2018, Vol. 4, No. 1, pp. 1–41

Authors: Koide, Satoshi; Tadokoro, Yukihiro; Yoshimura, Takayoshi; Xiao, Chuan; Ishikawa, Yoshiharu. Affiliations: Toyota Cent Res & Dev Labs Inc, Nagakute, Aichi 4801192, Japan; Nagoya Univ, Furou Cho, Nagoya, Aichi 464860, Japan

In this article, we propose a novel indexing and querying method for trajectories constrained in a road network. We aim to provide efficient algorithms for various types of spatiotemporal queries that involve routing in road networks, such as (1) finding moving objects that have traveled along a given path during a given time interval, (2) extracting all paths traveled after a given spatiotemporal context, and (3) enumerating all paths between two locations traveled during a certain time interval. Unlike the existing methods in spatial database research, we employ indexing techniques and algorithms from string processing. This idea is based on the fact that we can represent spatial paths as strings, because trajectories in a network are represented as sequences of road segment IDs. The proposed SNT-index (suffix-array-based network-constrained trajectory index) introduces two novel concepts to trajectory indexing. The first is FM-index, which is a compact in-memory data structure for pattern matching. The second is an inverse suffix array, which allows the FM-index to be integrated with the temporal information stored in a forest of B+-trees. Thanks to these concepts, we can reduce the number of B+-tree accesses required by the query processing algorithms to a constant number, something that cannot be achieved with existing methods. Although an FM-index is essentially a static index, we also propose a practical method of appending new data to the index. Finally, experiments show that our method can process the target queries for more than 1 million trajectories in a few tens of milliseconds, which is significantly faster than what the baseline algorithms can achieve without string algorithms.

Keywords: Spatiotemporal indexing; network-constrained trajectories; string algorithms
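
The abstract's central idea, that a trajectory in a road network is a string of road segment IDs, can be illustrated with a small sketch. The code below is not the SNT-index: it uses a naively built plain suffix array instead of an FM-index and ignores the temporal B+-trees entirely. The function names, the `-1` separator, and the sample segment IDs are all assumptions made for illustration.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix itself.
    # Fine for a demo; real indexes use linear-time construction.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of `pattern` in `text` via binary search."""
    m = len(pattern)
    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def index_trajectories(trajectories):
    """Concatenate trajectories into one sequence of segment IDs.

    A separator (-1, assumed never to be a real segment ID) follows each
    trajectory so that a query path cannot match across two trajectories.
    """
    text, owner = [], []
    for tid, traj in enumerate(trajectories):
        for seg in traj:
            text.append(seg)
            owner.append(tid)
        text.append(-1)
        owner.append(tid)
    text = tuple(text)
    return text, owner, build_suffix_array(text)


def trajectories_along(path, text, owner, sa):
    """IDs of trajectories that traveled along `path` (a list of segment IDs)."""
    return sorted({owner[p] for p in find_occurrences(text, sa, tuple(path))})


# Hypothetical data: three trajectories over made-up segment IDs.
text, owner, sa = index_trajectories([[1, 2, 3, 4], [2, 3, 5], [7, 1, 2, 3]])
print(trajectories_along([2, 3], text, owner, sa))
```

A query such as "which objects traveled along path 2, 3" then becomes a pattern-matching query. Restricting the result to a time interval, as the article describes, would need the additional temporal index.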

Missing value replacement in strings and applications

DATA MINING AND KNOWLEDGE DISCOVERY, 2025, Vol. 39, No. 2, pp. 1-50

Authors: Bernardini, Giulia; Liu, Chang; Loukides, Grigorios; Marchetti-Spaccamela, Alberto; Pissis, Solon P.; Stougie, Leen; Sweering, Michelle. Affiliations: Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy; Zhejiang Univ, Med Ctr, Zhejiang, Peoples R China; Kings Coll London, Dept Informat, London, England; Univ Roma La Sapienza, Dept Comp Control & Management Engn, Rome, Italy; CWI, Amsterdam, Netherlands; Vrije Univ, Fac Sci, Amsterdam, Netherlands; ERABLE Team, Lyon, France; Vrije Univ, Sch Business & Econ, Amsterdam, Netherlands

Missing values arise routinely in real-world sequential (string) datasets due to: (1) imprecise data measurements; (2) flexible sequence modeling, such as binding profiles of molecular sequences; or (3) the existence of confidential information in a dataset which has been deleted deliberately for privacy protection. In order to analyze such datasets, it is often important to replace each missing value with one or more valid letters, in an efficient and effective way. Here we formalize this task as a combinatorial optimization problem: the set of constraints includes the context of the missing value (i.e., its vicinity) as well as a finite set of user-defined forbidden patterns, modeling, for instance, implausible or confidential patterns; and the objective function seeks to minimize the number of new letters we introduce. Algorithmically, our problem translates to finding shortest paths in special graphs that contain forbidden edges representing the forbidden patterns. Our work makes the following contributions: (1) we design a linear-time algorithm to solve this problem for strings over constant-sized alphabets; (2) we show how our algorithm can be effortlessly applied to fully sanitize a private string in the presence of a set of fixed-length forbidden patterns [Bernardini et al. 2021a]; (3) we propose a methodology for sanitizing and clustering a collection of private strings that utilizes our algorithm and an effective and efficiently computable distance measure; and (4) we present extensive experimental results showing that our methodology can efficiently sanitize a collection of private strings while preserving clustering quality, outperforming the state of the art and baselines. To arrive at our theoretical results, we employ techniques from formal languages and combinatorial pattern matching.

Keywords: string algorithms; forbidden patterns; missing value replacement; string sanitization

Quantum Divide and Conquer

ACM TRANSACTIONS ON QUANTUM COMPUTING, 2025, Vol. 6, No. 2

Authors: Childs, Andrew; Kothari, Robin; Kovacs-Deak, Matt; Sundaram, Aarthi; Wang, Daochen. Affiliations: Univ Maryland, College Pk, MD 20742, USA; Microsoft Corp, Redmond, WA, USA

The divide-and-conquer framework, used extensively in classical algorithm design, recursively breaks a problem of size $n$ into smaller subproblems (say, $a$ copies of size $n/b$ each), along with some auxiliary work of cost $C^{\mathrm{aux}}(n)$, to give a recurrence relation $C(n) \le a\,C(n/b) + C^{\mathrm{aux}}(n)$ for the classical complexity $C(n)$. We describe a quantum divide-and-conquer framework that, in certain cases, yields an analogous recurrence relation $C_Q(n) \le \sqrt{a}\,C_Q(n/b) + O(C^{\mathrm{aux}}_Q(n))$ that characterizes the quantum query complexity. We apply this framework to obtain near-optimal quantum query complexities for various string problems, such as (i) recognizing the regular language $\Sigma^* 2 0^* 2 \Sigma^*$ over the alphabet $\Sigma = \{0, 1, 2\}$; (ii) decision versions of String Rotation and String Suffix; and natural parameterized versions of (iii) Longest Increasing Subsequence and (iv) Longest Common Subsequence.

Keywords: quantum computing; quantum query complexity; divide and conquer; string algorithms; regular languages
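
To see why the $\sqrt{a}$ factor in the quantum recurrence matters, it may help to solve it for one concrete choice of parameters. The values $a = 2$, $b = 2$ and constant auxiliary cost below are assumptions chosen for illustration, not a case taken from the article.

```latex
% Assumed illustrative parameters: a = 2, b = 2, C^{aux}_Q(n) = O(1), C_Q(1) = O(1).
\begin{align*}
C_Q(n) &\le \sqrt{2}\, C_Q(n/2) + O(1) \\
       &\le \sum_{i=0}^{\log_2 n} (\sqrt{2})^{i} \cdot O(1) \\
       &= O\big((\sqrt{2})^{\log_2 n}\big) = O(\sqrt{n}),
\end{align*}
% whereas the classical recurrence C(n) \le 2\,C(n/2) + O(1) solves to O(n).
```

Under these assumptions the quantum recurrence gives a quadratic improvement over its classical counterpart.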

Quantum Meets Fine-Grained Complexity: Sublinear Time Quantum Algorithms for String Problems

ALGORITHMICA, 2023, Vol. 85, No. 5, pp. 1251-1286

Authors: Le Gall, Francois; Seddighin, Saeed. Affiliations: Nagoya Univ, Grad Sch Math, Chikusa Ku, Furocho, Nagoya, Aichi 4648602, Japan; Toyota Technol Inst Chicago, 6045 S Kenwood Ave, Chicago, IL 60637, USA

Longest common substring (LCS), longest palindrome substring (LPS), and Ulam distance (UL) are three fundamental string problems that can be classically solved in near linear time. In this work, we present sublinear time quantum algorithms for these problems along with quantum lower bounds. Our results shed light on a very surprising fact: although the classic solutions for LCS and LPS are almost identical (via suffix trees), their quantum computational complexities are different. While we give an exact $\tilde{O}(\sqrt{n})$ time algorithm for LPS, we prove that LCS needs at least time $\tilde{\Omega}(n^{2/3})$ even for 0/1 strings.

Keywords: string algorithms; quantum algorithms; sublinear-time algorithms; fine-grained complexity

Online and Offline Algorithms for Counting Distinct Closed Factors via Sliding Suffix Trees

50th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2025

Authors: Mieno, Takuya; Takahashi, Shun; Seto, Kazuhisa; Horiyama, Takashi. Affiliations: University of Electro-Communications, Chofu, Japan; Hokkaido University, Sapporo, Japan

ISBN: 9783031826962 (print)

A string is said to be closed if its length is one, or if it has a non-empty factor that occurs both as a prefix and as a suffix of the string, but does not occur elsewhere. The notion of closed words was introduced by [Fici, WORDS 2011]. Recently, the maximum number of distinct closed factors occurring in a string was investigated by [Parshina and Puzynina, Theor. Comput. Sci. 2024], and an asymptotically tight bound was proved. In this paper, we propose two algorithms to count the distinct closed factors in a string T of length n over an alphabet of size σ. The first algorithm runs in O(n log σ) time using O(n) space for string T given in an online manner. The second algorithm runs in O(n) time using O(n) space for string T given in an offline manner. Both algorithms utilize suffix trees for sliding windows. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: closed words; sliding suffix trees; string algorithms
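
The definition of a closed string in the abstract can be checked directly with a brute-force sketch. This is only a reference implementation of the definition, useful for small inputs. It runs in polynomial but far from linear time, and it has nothing to do with the sliding suffix tree algorithms the paper proposes. The function names are hypothetical.

```python
def is_closed(s):
    """True if s has length one, or has a non-empty proper border that
    occurs in s only as a prefix and as a suffix (i.e. exactly twice)."""
    n = len(s)
    if n == 1:
        return True
    for k in range(1, n):
        border = s[:k]
        if not s.endswith(border):
            continue
        # Count all (possibly overlapping) occurrences of the border.
        occurrences = sum(
            1 for i in range(n - k + 1) if s.startswith(border, i)
        )
        # Prefix and suffix positions differ because k < n, so exactly
        # two occurrences means the border occurs nowhere else.
        if occurrences == 2:
            return True
    return False


def count_distinct_closed_factors(t):
    """Count the distinct non-empty substrings of t that are closed."""
    factors = {t[i:j] for i in range(len(t)) for j in range(i + 1, len(t) + 1)}
    return sum(1 for f in factors if is_closed(f))


print(count_distinct_closed_factors("abab"))
```

For example, "aba" is closed (its border "a" occurs only at both ends), while "ab" is not, because it has no border at all.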

Efficient Parallel Algorithms for String Comparison

50th International Conference on Parallel Processing (ICPP)

Authors: Mishin, Nikita; Berezun, Daniil; Tiskin, Alexander. Affiliations: St Petersburg State Univ, St Petersburg, Russia; JetBrains Res, Prague, Czech Republic

ISBN: 9781450390682 (print)

The longest common subsequence (LCS) problem on a pair of strings is a classical problem in string algorithms. Its extension, the semi-local LCS problem, provides a more detailed comparison of the input strings, without any increase in asymptotic running time. Several semi-local LCS algorithms have been proposed previously; however, to the best of our knowledge, none have yet been implemented. In this paper, we explore a new hybrid approach to the semi-local LCS problem. We also propose a novel bit-parallel LCS algorithm. In the experimental part of the paper, we present an implementation of several existing and new parallel LCS algorithms and evaluate their performance.

Keywords: string algorithms; longest common subsequence; semi-local string comparison; parallel algorithms; divide-and-conquer; dynamic programming; braid multiplication; parallel braid multiplication; bit-parallel algorithms

Bidirectional String Anchors for Improved Text Indexing and Top-K Similarity Search

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, Vol. 35, No. 11, pp. 11093-11111

Authors: Loukides, Grigorios; Pissis, Solon P.; Sweering, Michelle. Affiliations: Kings Coll London, London WC2R 2LS, England; CWI, NL-1098 XG Amsterdam, Netherlands; Vrije Univ, NL-1081 HV Amsterdam, Netherlands

The minimizers sampling mechanism is a popular mechanism for string sampling. However, minimizers sampling mechanisms lack good guarantees on the expected size of their samples for different combinations of their input parameters. Furthermore, indexes constructed over minimizers samples lack good worst-case guarantees for on-line pattern searches. In response, we propose bidirectional string anchors (bd-anchors), a new string sampling mechanism. Given an integer $\ell$, our mechanism selects the lexicographically smallest rotation in every length-$\ell$ fragment. We show that, like minimizers samples, bd-anchors samples are approximately uniform, locally consistent, and computable in linear time. Furthermore, our experiments demonstrate that the bd-anchors sample sizes decrease proportionally to $\ell$; and that these sizes are competitive to or smaller than the minimizers sample sizes. We theoretically justify these results by analyzing the expected size of bd-anchors samples. We also prove that computing a total order on the input alphabet which minimizes the bd-anchors sample size is NP-hard. We next highlight the benefits of bd-anchors in two important applications: text indexing and top-K similarity search. For the first application, we develop an index for performing on-line pattern searches in near-optimal time, and show experimentally that a simple implementation of our index is consistently faster for on-line pattern searches than an analogous implementation of a minimizers-based index; we also show that it is substantially faster than two classic text indexes. For the second application, we develop a heuristic for top-K similarity search under edit distance, and show experimentally that it is generally as accurate as the state-of-the-art tool for the same purpose but more than one order of magnitude faster.

Keywords: string algorithms; string sampling; text indexing; top-K string similarity search
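
The sampling rule in the abstract (pick the lexicographically smallest rotation in every fragment of a fixed length) can be sketched by brute force. The version below takes quadratic time per fragment and is not the linear-time construction the paper claims. Breaking ties toward the leftmost rotation is an assumption, and the function names and the parameter name `ell` are hypothetical.

```python
def smallest_rotation_start(fragment):
    """Index at which the lexicographically smallest rotation of
    `fragment` starts; ties go to the leftmost index (assumed rule)."""
    n = len(fragment)
    return min(range(n), key=lambda i: (fragment[i:] + fragment[:i], i))


def bd_anchors(text, ell):
    """Sorted set of text positions selected as anchors.

    For every length-`ell` fragment of `text`, the selected position is
    where that fragment's smallest rotation begins.
    """
    anchors = set()
    for start in range(len(text) - ell + 1):
        fragment = text[start:start + ell]
        anchors.add(start + smallest_rotation_start(fragment))
    return sorted(anchors)


print(bd_anchors("banana", 3))
```

Because consecutive fragments overlap heavily, they often select the same position, which is what keeps the sample small.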

Efficient computation of longest single-arm-gapped palindromes in a string

THEORETICAL COMPUTER SCIENCE, 2020, Vol. 812, pp. 160-173

Authors: Narisada, Shintaro; Hendrian, Diptarama; Narisawa, Kazuyuki; Inenaga, Shunsuke; Shinohara, Ayumi. Affiliations: Tohoku Univ, Grad Sch Informat Sci, Sendai, Miyagi, Japan; Kyushu Univ, Dept Informat, Fukuoka, Japan

In this paper, we introduce new types of approximate palindromes called single-arm-gapped palindromes (SAGPs for short). A SAGP contains a gap in either its left or right arm, and is of the form either $w g u c u^R w^R$ or $w u c u^R g w^R$, where $w$ and $u$ are non-empty strings, $w^R$ and $u^R$ are respectively the reversed strings of $w$ and $u$, $g$ is a string called a gap, and $c$ is either a single character or the empty string. Here we call $wu$ and $u^R w^R$ the arms of the SAGP, and $|uw|$ the length of the arm. We classify SAGPs into two groups: those which have $u c u^R$ as a maximal palindrome (type-1), and the others (type-2). We propose several algorithms to compute type-1 SAGPs with longest arms occurring in a given string, based on suffix arrays. Then, we propose a linear-time algorithm to compute all type-1 SAGPs with longest arms, based on suffix trees. Also, we show how to compute type-2 SAGPs with longest arms in linear time. We also perform some preliminary experiments to show practical performances of the proposed methods. (C) 2019 Elsevier B.V. All rights reserved.

Keywords: string algorithms; palindromes; suffix trees; suffix arrays

Covering a string

ALGORITHMICA, 1996, Vol. 16, No. 3, pp. 288-297

Authors: Iliopoulos, CS; Moore, DWG; Park, K. Affiliations: Curtin Univ Technol, Sch Comp, Perth, WA 6001, Australia; Seoul Natl Univ, Dept Comp Engn, Seoul 151742, South Korea

We consider the problem of finding the repetitive structures of a given string x. The period u of the string x grasps the repetitiveness of x, since x is a prefix of a string constructed by concatenations of u. We generalize the concept of repetitiveness as follows: A string w covers a string x if there is a superstring of x which is constructed by concatenations and superpositions of w. A substring w of x is called a seed of x if w covers x. We present an O(n log n)-time algorithm for finding all the seeds of a given string of length n.

Keywords: combinatorial algorithms on words; string algorithms; periodicity of strings; covering of strings; partitioning
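
The definition of a seed in the abstract can be tested directly on small inputs. The sketch below is a brute-force check, not the O(n log n) algorithm from the paper. It slides w over x, allows w to hang over either end of x (those overhangs form the superstring), and checks that the matching placements leave no gap. The function names are hypothetical.

```python
def matches_at(x, w, p):
    """True if w placed at offset p agrees with x wherever they overlap.
    Positions of w that fall outside x are unconstrained."""
    return all(
        w[j] == x[p + j] for j in range(len(w)) if 0 <= p + j < len(x)
    )


def is_seed(x, w):
    """True if w is a substring of x and some superstring of x can be
    built from concatenations and superpositions of w."""
    n, m = len(x), len(w)
    if m == 0 or w not in x:
        return False
    # All offsets (possibly negative, i.e. overhanging on the left) at
    # which w is compatible with x; generated in increasing order.
    candidates = [p for p in range(-(m - 1), n) if matches_at(x, w, p)]
    left = [p for p in candidates if p <= 0]
    if not left:
        return False
    # Start from the rightmost placement that still covers position 0.
    reach = max(left) + m
    for p in candidates:
        if p <= 0:
            continue
        if reach >= n:
            break
        if p > reach:  # uncovered gap between consecutive placements
            return False
        reach = p + m
    return reach >= n


print(is_seed("abaab", "aba"), is_seed("abaab", "ab"))
```

For x = "abaab", the substring "aba" is a seed because "abaaba" is a concatenation of two copies of "aba" and has x as a prefix, while "ab" is not, since no string built only from copies of "ab" contains "aa".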

Elastic-Degenerate String Matching with 1 Error or Mismatch

THEORY OF COMPUTING SYSTEMS, 2024, Vol. 68, No. 5, pp. 1442-1467

Authors: Bernardini, Giulia; Gabory, Esteban; Pissis, Solon P.; Stougie, Leen; Sweering, Michelle; Zuba, Wiktor. Affiliations: Univ Trieste, Trieste, Italy; CWI, Amsterdam, Netherlands; Vrije Univ, Amsterdam, Netherlands

An elastic-degenerate (ED) string is a sequence of $n$ finite sets of strings of total length $N$, introduced to represent a set of related DNA sequences, also known as a pangenome. The ED string matching (EDSM) problem consists in reporting all occurrences of a pattern of length $m$ in an ED text. The EDSM problem has recently received some attention by the combinatorial pattern matching community, culminating in an $\tilde{O}(nm^{\omega-1}) + O(N)$-time algorithm [Bernardini et al., SIAM J. Comput. 2022], where $\omega$ denotes the matrix multiplication exponent and the $\tilde{O}(\cdot)$ notation suppresses polylog factors. In the $k$-EDSM problem, the approximate version of EDSM, we are asked to report all pattern occurrences with at most $k$ errors. $k$-EDSM can be solved in $O(k^2 mG + kN)$ time, under edit distance, or $O(kmG + kN)$ time, under Hamming distance, where $G$ denotes the total number of strings in the ED text [Bernardini et al., Theor. Comput. Sci. 2020]. Unfortunately, $G$ is only bounded by $N$, and so even for $k = 1$, the existing algorithms run in $\Omega(mN)$ time in the worst case. In this paper we make progress in this direction. We show that 1-EDSM can be solved in $O((nm^2 + N)\log m)$ or $O(nm^3 + N)$ time under edit distance. For the decision version of the problem, we present a faster $O(nm^2\sqrt{\log m} + N\log\log m)$-time algorithm. We also show that 1-EDSM can be solved in $O(nm^2 + N\log m)$ time under Hamming distance. Our algorithms for edit distance rely on non-trivial reductions from 1-EDSM to special instances of classic computational geometry problems (2d rectangle stabbing or 2d range emptiness), which we show how to solve efficiently. In order to obtain an even faster algorithm for Hamming distance, we rely on employing and adapting the k-errata trees for indexing with errors [Cole et al., STOC 2004]. This is an extended version of a paper presented at LATIN 2022.

Keywords: string algorithms; approximate string matching; edit distance; Hamming distance; elastic-degenerate strings
