Clique-width is a well-studied graph parameter owing to its use in understanding algorithmic tractability, and in this paper, we study the class of bounded clique-width graphs through the lens of succinct data structures. A data structure is said to be succinct if the amount of space it uses is information-theoretically optimal up to lower-order additive terms. More specifically, we design a succinct data structure for graphs on n vertices whose clique-width is at most k ≤ ε√(log n) for some constant 0 < ε < 1/6, that supports degree, adjacency, and neighborhood queries efficiently. This resolves an open problem of Kamali (Algorithmica, 2018). As an application of our main technique, we also propose succinct data structures for distance-hereditary and Ptolemaic graphs, which are subclasses of the class of graphs with clique-width at most 3. (c) 2024 Elsevier B.V. All rights reserved.
We give succinct data structures that store a tree with colors on the nodes. Given a node x and a color alpha, the structures find the nearest node to x with color alpha. Our results improve the O(n log n)-bit structure of Gawrychowski et al. (2016) [12]. (C) 2017 Elsevier B.V. All rights reserved.
We propose succinct data structures for text retrieval systems supporting document listing queries and ranking queries based on the tf*idf (term frequency times inverse document frequency) scores of documents. Traditional data structures for these problems support queries only for some predetermined keywords. Recently Muthukrishnan proposed a data structure for document listing queries for arbitrary patterns at the cost of data structure size. For computing the tf*idf scores there have been no efficient data structures for arbitrary patterns. Our new data structures support these queries using small space. The space is only 2/epsilon times the size of the compressed documents plus 10n bits for a document collection of length n, for any 0 < epsilon <= 1. This is much smaller than the previous O(n log n)-bit data structures. Query time is O(m + q log^epsilon n) for listing and computing tf*idf scores for all q documents containing a given pattern of length m. Our data structures are flexible in the sense that they support queries for arbitrary patterns. (C) 2006 Elsevier B.V. All rights reserved.
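To make the tf*idf query concrete, the following is a naive Python sketch that scores every document containing an arbitrary pattern. It scans the raw documents and uses no succinct index, so it only illustrates what the query returns, not how the paper answers it. The function names and the weighting idf = log(N / df) are assumptions; the abstract does not state the exact formula.

```python
import math


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count, start = 0, 0
    while True:
        pos = text.find(pattern, start)
        if pos == -1:
            return count
        count += 1
        start = pos + 1


def tf_idf_scores(documents, pattern):
    """Map document index -> tf*idf score, for documents containing pattern.

    Assumed weighting (hypothetical, not taken from the paper):
    tf = occurrences of the pattern in the document,
    idf = log(N / df), with df = number of documents containing the pattern.
    """
    tf = [count_occurrences(doc, pattern) for doc in documents]
    df = sum(1 for t in tf if t > 0)
    if df == 0:
        return {}
    idf = math.log(len(documents) / df)
    return {i: t * idf for i, t in enumerate(tf) if t > 0}
```

Calling `tf_idf_scores(["abab", "ba", "cc"], "ab")` scores only the first document, since it is the only one containing the pattern.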
ISBN: 9781450340694 (print)
Succinct data structures are used today in many information retrieval applications, e.g., posting list representation, language model representation, indexing (social) graphs, query auto-completion, document retrieval, and indexing dictionaries of strings, just to mention the most recent ones. This new kind of data structure mimics the operations of its classical counterpart within a comparable time complexity but requires much less space. With the availability of several libraries for basic succinct structures, such as SDSL, Succinct, Facebook's Folly, and Sux, it is relatively easy to profit directly from advances in this field. In this tutorial we will introduce this field of research by presenting the most important succinct data structures for representing sets of integers, sets of points, trees, graphs, and strings, together with their most important applications to Information Retrieval problems. The introduction of the succinct data structures will be supported by a practical session with programming handouts to solve. This will allow the attendees to experiment directly with implementations of these solutions on real datasets and understand the potential benefits they can bring to their own projects.
ISBN: 9781450343800 (print)
We focus on succinct data structures, that is, on time- and space-efficient representations of trees and other combinatorial objects that dominate the memory requirements of most sophisticated programs and systems.
We consider time-space tradeoffs for static data structure problems in the cell probe model with word size 1 (the bit probe model). In this model, the goal is to represent n-bit data with s = n + r bits such that queries (of a certain type) about the data can be answered by reading at most t bits of the representation. Ideally, we would like to keep both s and t small, but there are tradeoffs between the values of s and t that limit the possibilities of keeping both parameters small. In this paper, we consider the case of succinct representations, where s = n + r for some redundancy r << n. For a Boolean version of the problem of polynomial evaluation with preprocessing of coefficients, we show a lower bound on the redundancy-query time tradeoff of the form (r + 1)t >= Omega(n / log n). In particular, for very small redundancies r, we get an almost optimal lower bound stating that the query algorithm has to inspect almost the entire data structure (up to a logarithmic factor). We show similar lower bounds for problems satisfying certain combinatorial properties of a coding-theoretic flavor, and obtain (r + 1)t >= Omega(n) for certain problems. Previously, no omega(m) lower bounds were known on t in the general model for explicit Boolean problems, even for very small redundancies. By restricting our attention to systematic or index structures phi satisfying phi(x) = x . phi*(x) for some map phi* (where . denotes concatenation), we show similar lower bounds on the redundancy-query time tradeoff for the natural data structuring problems of Prefix Sum and Substring Search. (c) 2007 Elsevier B.V. All rights reserved.
Succinct data structures provide the same functionality as their corresponding traditional data structures in compact space. We improve on the functions rank and select, which are the basic building blocks of FM-indexes and other succinct data structures. First, we present a cache-optimal, uncompressed bitvector representation that outperforms all existing approaches. Next, we improve, in both space and time, on a recent result by Navarro and Providel on compressed bitvectors. Last, we show techniques to perform rank and select on 64-bit words that are up to three times faster than existing methods. In our experimental evaluation, we first show how our improvements affect cache and runtime performance of both operations on data sets larger than those commonly used in the evaluation of succinct data structures. Our experiments show that our improvements to these basic operations significantly improve the runtime performance and compression effectiveness of FM-indexes on small and large data sets. To our knowledge, our improvements result in FM-indexes that are either smaller or faster than all current state-of-the-art implementations. Copyright (C) 2013 John Wiley & Sons, Ltd.
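As a point of reference for what rank and select on a single 64-bit word compute, here is a minimal Python sketch. It is a textbook baseline (bit counting and repeatedly clearing the lowest set bit), not the faster technique the paper proposes. The names `rank64` and `select64` and the convention that bit positions are counted from the lowest bit are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits


def rank64(word, i):
    """Number of 1 bits among the i lowest bit positions (0 .. i-1)."""
    if not 0 <= i <= 64:
        raise ValueError("i must be in [0, 64]")
    return bin(word & MASK64 & ((1 << i) - 1)).count("1")


def select64(word, k):
    """Position of the k-th 1 bit (k >= 1, lowest bit first), or -1 if absent."""
    if k < 1:
        raise ValueError("k must be at least 1")
    word &= MASK64
    for _ in range(k - 1):
        word &= word - 1  # clear the lowest set bit
    if word == 0:
        return -1
    return (word & -word).bit_length() - 1  # index of the lowest set bit
```

For the word `0b10110`, the set bits are at positions 1, 2, and 4, so `rank64(0b10110, 3)` counts two of them and `select64(0b10110, 3)` returns position 4.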
In this paper, we study different approaches for rank and select on sequences of bytes and propose new implementation strategies. An extensive experimental evaluation comparing the efficiency of the different alternatives is provided. Given a sequence of bits, a rank query counts the number of occurrences of the bit 1 up to a given position, and a select query returns the position of the ith occurrence of the bit 1. These operations are widely used in information retrieval and management, being the basis of several data structures and algorithms for text collections, graphs, etc. There exist solutions for computing these operations on sequences of bits in constant time using additional information. However, new applications require rank and select to be computed on sequences of bytes instead of bits. The solutions for the binary case are not directly applicable to sequences of bytes. The existing solutions for the byte case vary in their space-time trade-off, which can still be improved.
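The byte-sequence versions of these queries can be sketched as follows. This is one simple sampling strategy chosen for illustration, not one of the implementations evaluated in the paper: cumulative counts for all 256 byte values are stored at block boundaries (the "additional information"), rank finishes with a short scan inside one block, and select binary-searches over rank. The class name, the `block_size` parameter, and the 0-based, end-exclusive position convention are assumptions.

```python
class ByteRankSelect:
    """Rank/select over a byte sequence using sampled cumulative counts."""

    def __init__(self, data: bytes, block_size: int = 4):
        if block_size < 1:
            raise ValueError("block_size must be positive")
        self.data = data
        self.block_size = block_size
        # samples[j][b] = occurrences of byte value b in data[:j * block_size]
        self.samples = [[0] * 256]
        counts = [0] * 256
        for pos, byte in enumerate(data, start=1):
            counts[byte] += 1
            if pos % block_size == 0:
                self.samples.append(counts.copy())

    def rank(self, byte, i):
        """Occurrences of byte in data[:i] (i is clamped to the valid range)."""
        i = max(0, min(i, len(self.data)))
        block = i // self.block_size
        start = block * self.block_size
        in_block = sum(1 for b in self.data[start:i] if b == byte)
        return self.samples[block][byte] + in_block

    def select(self, byte, k):
        """0-based position of the k-th occurrence of byte (k >= 1), or -1."""
        if k < 1 or self.rank(byte, len(self.data)) < k:
            return -1
        lo, hi = 0, len(self.data) - 1
        while lo < hi:
            mid = (lo + hi) // 2
            if self.rank(byte, mid + 1) >= k:
                hi = mid
            else:
                lo = mid + 1
        return lo
```

A larger `block_size` stores fewer samples but makes each rank scan longer, which is the kind of space-time trade-off the abstract refers to; the sample table here costs 256 counters per block, so it is far from space-efficient.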
ISBN: 3540404937 (print)
We consider time-space tradeoffs for static data structure problems in the cell probe model with word size 1 (the bit probe model). In this model, the goal is to represent n-bit data with s = n + r bits such that queries (of a certain type) about the data can be answered by reading at most t bits of the representation. Ideally, we would like to keep both s and t small, but there are tradeoffs between the values of s and t that limit the possibilities of keeping both parameters small. In this paper, we consider the case of succinct representations, where s = n + r for some redundancy r << n. For a Boolean version of the problem of polynomial evaluation with preprocessing of coefficients, we show a lower bound on the redundancy-query time tradeoff of the form (r + 1)t >= Omega(n / log n). In particular, for very small redundancies r, we get an almost optimal lower bound stating that the query algorithm has to inspect almost the entire data structure (up to a logarithmic factor). We show similar lower bounds for problems satisfying certain combinatorial properties of a coding-theoretic flavor, and obtain (r + 1)t >= Omega(n) for certain problems. Previously, no omega(m) lower bounds were known on t in the general model for explicit Boolean problems, even for very small redundancies. By restricting our attention to systematic or index structures phi satisfying phi(x) = x . phi*(x) for some map phi* (where . denotes concatenation), we show similar lower bounds on the redundancy-query time tradeoff for the natural data structuring problems of Prefix Sum and Substring Search. (c) 2007 Elsevier B.V. All rights reserved.
A data structure is called succinct if its asymptotic space requirement matches the original data size. The development of succinct data structures is an important factor in dealing with the explosive growth of big data. Moreover, a wider variety of big data has recently been produced in various fields, and there is a substantial need for the development of more application-specific succinct data structures. In this study, we review recently proposed application-oriented succinct data structures motivated by big data applications in three different fields: privacy-preserving computation in cryptography, genome assembly in bioinformatics, and work space reduction for compressed communications.