Compressed Suffix Arrays (CSAs) offer the same functionality as classical suffix arrays (SAs), and more, within space close to that of the compressed text, and in addition they can reproduce any text fragment. Further...
详细信息
Compressed Suffix Arrays (CSAs) offer the same functionality as classical suffix arrays (SAs), and more, within space close to that of the compressed text, and in addition they can reproduce any text fragment. Furthermore, their pattern search times are comparable to those of SAs. This combination has made CSAs extremely successful substitutes for SAs in space-demanding applications. Their weakest point is that they are orders of magnitude slower when retrieving the precise positions of pattern occurrences. SAs have other well-known shortcomings, inherited by CSAs, such as not retrieving those positions in a useful order. In this paper we present new techniques that, on the one hand, improve the current space/time tradeoffs for retrieving pattern occurrences in CSAs, and on the other, efficiently support extended pattern locating functionalities, such as retrieving occurrences in text order or limiting the occurrences to within a text window. Our experimental results display considerable savings with respect to the baseline techniques in many cases of interest: in some cases we outperform their time by a factor of two or three. (C) 2015 Elsevier B. V. All rights reserved.
In the field of online algorithms, paging is a well-studied problem. LRU is a simple paging algorithm that incurs few cache misses and supports efficient implementations. algorithms outperforming LRU in terms of cache...
详细信息
Succinct data structures provide the same functionality as their corresponding traditional data structure in compact space. We improve on functions rank and select, which are the basic building blocks of FM-indexes an...
详细信息
Succinct data structures provide the same functionality as their corresponding traditional data structure in compact space. We improve on functions rank and select, which are the basic building blocks of FM-indexes and other succinct data structures. First, we present a cache-optimal, uncompressed bitvector representation that outperforms all existing approaches. Next, we improve, in both space and time, on a recent result by Navarro and Providel on compressed bitvectors. Last, we show techniques to perform rank and select on 64-bit words that are up to three times faster than existing methods. In our experimental evaluation, we first show how our improvements affect cache and runtime performance of both operations on data sets larger than commonly used in the evaluation of succinct data structures. Our experiments show that our improvements to these basic operations significantly improve the runtime performance and compression effectiveness of FM-indexes on small and large data sets. To our knowledge, our improvements result in FM-indexes that are either smaller or faster than all current state of the art implementations. Copyright (C) 2013 John Wiley & Sons, Ltd.
Best connections in real networks are usually found by applying Dijkstra's shortest paths algorithm. Unfortunately, networks deriving from real-world applications are huge, yielding unsustainable times to compute ...
详细信息
Best connections in real networks are usually found by applying Dijkstra's shortest paths algorithm. Unfortunately, networks deriving from real-world applications are huge, yielding unsustainable times to compute shortest paths. Therefore, considerable research has been conducted in recent years to accelerate Dijkstra's algorithm on typical instances of transportation and communication networks, such as road networks. These efforts have led to the development of many so called speed-up techniques, as for example Arc-Flags. The main drawback of many of these techniques, including Arc-Flags, is that they do not work well in the realistic dynamic scenarios where the networks change over time. In this article, we introduce a new data structure, named Road-Signs, which is used to update the Arc-Flags of a graph in fully dynamic scenarios. Road-Signs can be used to compute Arc-Flags, can be efficiently updated and does not require large space consumption for sparse networks. We develop a fully dynamic algorithm for updating Arc-Flags, by updating Road-Signs, each time that a modification occurs on an edge of the network. We show that this algorithm is better than recomputation from scratch of Arc-Flags in terms of the affected parameters of the input, which makes this solution suitable to be efficient in practice. However, it is not better than recomputation from scratch in the worst case. We also propose an experimental study to evaluate the practical performance of the new dynamic algorithm. To this aim, we use real-world road networks subject to sequences of weight change operations. Our experiments show a significant speed-up in the updating phase with respect to the recomputation from scratch of *** (c) 2014 Wiley Periodicals, Inc. NETWORKS, Vol. 63(3), 243-259 2014
A sunflower in a hypergraph is a set of hyperedges pairwise intersecting in exactly the same vertex set. Sunflowers are a useful tool in polynomial-time data reduction for problems formalizable as d-Hitting Set, the p...
详细信息
A sunflower in a hypergraph is a set of hyperedges pairwise intersecting in exactly the same vertex set. Sunflowers are a useful tool in polynomial-time data reduction for problems formalizable as d-Hitting Set, the problem of covering all hyperedges (whose cardinality is bounded from above by a constant d) of a hypergraph by at most k vertices. Additionally, in fault diagnosis, sunflowers yield concise explanations for "highly defective structures". We provide a linear-time algorithm that, by finding sunflowers, transforms an instance of d-Hitting Set into an equivalent instance comprising at most O(k (d) ) hyperedges and vertices. In terms of parameterized complexity, we show a problem kernel with asymptotically optimal size (unless ) and provide experimental results that show the practical applicability of our algorithm. Finally, we show that the number of vertices can be reduced to O(k (d-1)) with additional processing in O(k (1.5d) ) time-nontrivially combining the sunflower technique with problem kernels due to Abu-Khzam and Moser.
We study the computation of good alternatives to the shortest path in road networks. Our approach is based on single via-node routing on top of contraction hierarchies and achieves superior quality and efficiency comp...
详细信息
Online mapping and navigation services are a corner stone of the World Wide Web. While automatically generated car directions have gone from static data to user-specific customizations, automatically generated route g...
详细信息
ISBN:
(纸本)9781450331418
Online mapping and navigation services are a corner stone of the World Wide Web. While automatically generated car directions have gone from static data to user-specific customizations, automatically generated route guidance has stayed very much the same. And as such, it doesn't necessarily reflect the complexity of turns and may also be hard to use while traveling. We report on a practical and efficient technique for computing and presenting generalizations of routes, so-called sketches that compress a route and its guidance into an easy to comprehend and usable form by non-uniformly scaling and generalizing certain parts of the route. Our work draws from a sound theoretical foundation, is easy to implement, and gives very good results in practice.
Geometric discrepancies are standard measures to quantify the irregularity of distributions. They are an important notion in numerical integration. One of the most important discrepancy notions is the so-called star d...
详细信息
ISBN:
(纸本)9781450319638
Geometric discrepancies are standard measures to quantify the irregularity of distributions. They are an important notion in numerical integration. One of the most important discrepancy notions is the so-called star discrepancy. Roughly speaking, a point set of low star discrepancy value allows for a small approximation error in quasi-Monte Carlo integration. It is thus the most studied discrepancy notion. In this work we present a new algorithm to compute point sets of low star discrepancy. The two components of the algorithm (for the optimization and the evaluation, respectively) are based on evolutionary principles. Our algorithm clearly outperforms existing approaches. To the best of our knowledge, it is also the first algorithm which can be adapted easily to optimize inverse star discrepancies.
Given a graph G=(V,E), the independent set problem is that of finding a maximum-cardinality subset S of V such that no two vertices in S are adjacent. We introduce two fast local search routines for this problem. The ...
详细信息
Given a graph G=(V,E), the independent set problem is that of finding a maximum-cardinality subset S of V such that no two vertices in S are adjacent. We introduce two fast local search routines for this problem. The first can determine in linear time whether a maximal solution can be improved by replacing a single vertex with two others. The second routine can determine in O(m Delta) time (where Delta is the highest degree in the graph) whether there are two solution vertices than can be replaced by a set of three. We also present a more elaborate heuristic that successfully applies local search to find near-optimum solutions to a wide variety of instances. We test our algorithms on instances from the literature as well as on new ones proposed in this paper.
This report introduces a new lossless asymmetric single instruction multiple data codec designed for extremely efficient decompression of large satellite images. A throughput in excess of 3GB/s allows decompression to...
详细信息
This report introduces a new lossless asymmetric single instruction multiple data codec designed for extremely efficient decompression of large satellite images. A throughput in excess of 3GB/s allows decompression to proceed in parallel with asynchronous transfers from fast block devices such as disk arrays. This is made possible by a simple and fast single instruction multiple data entropy coder that removes leading null bits. Our main contribution is a new approach for vectorized prediction and encoding. Unlike previous approaches that treat the entropy coder as a black box, we account for its properties in the design of the predictor. The resulting compressed stream is 1.2 to 1.5 times as large as JPEG-2000, but can be decompressed 100 times as quickly even faster than copying uncompressed data in memory. Applications include streaming decompression for out of core visualization. To the best of our knowledge, this is the first entirely vectorized algorithm for lossless compression. Copyright (c) 2011 John Wiley & Sons, Ltd.
暂无评论