We consider a generalization of the Steiner tree problem, the Steiner forest problem, in the Euclidean plane: the input is a multiset X subset of R2, partitioned into k color classes C1,. .. , C k subset of X . The go...
详细信息
We consider a generalization of the Steiner tree problem, the Steiner forest problem, in the Euclidean plane: the input is a multiset X subset of R2, partitioned into k color classes C1,. .. , C k subset of X . The goal is to find a minimum-cost Euclidean graph G such that every color class C is connected in G . We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to X . Each input point x is an element of X arrives with its color color(x) is an element of [k], and as usual for dynamic geometric streams, the input is restricted to the discrete grid {1, ... , Delta }2. We design a single-pass streaming algorithm that uses poly(k log Delta) space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio a 2 (currently 1.1547 <= a 2 <= 1.214). This approximation guarantee matches the state-of-the-art bound for streaming Steiner tree, i.e., when k = 1, and it is a major open question to improve the ratio to 1 + & even for this special case. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and so far has not been applied in the streaming setting. We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite multiplicative approximation requires Omega (k) bits of space.
Subspace analysis (SA) is a widely used technique for coping with high-dimensional data and is becoming a fundamental step in the early treatment of many signal-processing tasks. However, traditional SA often requires...
详细信息
Subspace analysis (SA) is a widely used technique for coping with high-dimensional data and is becoming a fundamental step in the early treatment of many signal-processing tasks. However, traditional SA often requires a large amount of memory and computational resources, as it is equivalent to eigenspace determination. To address this issue, specialized streaming algorithms have been developed, allowing SA to be run on low-power devices, such as sensors or edge devices. Here, we present a classification and a comparison of these methods by providing a consistent description and highlighting their features and similarities. We also evaluate their performance in the task of subspace identification with a focus on computational complexity and memory footprint for different signal dimensions. Additionally, we test the implementation of these algorithms on common hardware platforms typically employed for sensors and edge devices.
Submodular maximization is a significant area of interest in combinatorial optimization. It has various real-world applications. In recent years, streaming algorithms for submodular maximization have gained attention,...
详细信息
Submodular maximization is a significant area of interest in combinatorial optimization. It has various real-world applications. In recent years, streaming algorithms for submodular maximization have gained attention, allowing realtime processing of large data sets by examining each piece of data only once. However, most of the current state-of-the-art algorithms are only applicable to monotone submodular maximization. There are still significant gaps in the approximation ratios between monotone and non-monotone objective functions. In this paper, we propose a streaming algorithm framework for non-monotone submodular maximization and use this framework to design deterministic streaming algorithms for the d-knapsack constraint and the knapsack constraint. Our 1-pass streaming algorithm for the d-knapsack constraint has a 1/4(d+1)-& varepsilon;approximation ratio, using O((B) over tilde log (B) over tilde/& varepsilon;) memory, and O(log (B) over tilde & varepsilon;) query time per element, where (B) over tilde = min(n,b) is the maximum number of elements that the knapsack can store. As a special case of the d-knapsack constraint, we have the 1-pass streaming algorithm with a 1/8 - & varepsilon;approximation ratio to the knapsack constraint. To our knowledge, there is currently no streaming algorithm for this constraint when the objective function is non-monotone, even when d = 1. In addition, we propose a multi-pass streaming algorithm with 1/6 - & varepsilon;approximation, which stores O((B) over tilde )elements.
The study of non-submodular maximization on the integer lattice is an important extension of submodular optimization. In this paper, streaming algorithms for maximizing non -negative monotone non-submodular functions ...
详细信息
The study of non-submodular maximization on the integer lattice is an important extension of submodular optimization. In this paper, streaming algorithms for maximizing non -negative monotone non-submodular functions with knapsack constraint on integer lattice are considered. We first design a two-pass streamingKnapsack algorithm combining with BinarySearch as a subroutine for this problem. By introducing the DR ratio gamma and the weak DR ratio gamma w of the non-submodular objective function, we obtain that the approximation ratio is min{gamma 2(1 - epsilon)/2 gamma +1, 1 - 1/gamma w2 gamma - epsilon}, the total memory complexity is O (K log K /epsilon), and the total query complexity for each element is O (log K log(K/epsilon 2)/epsilon). Then, we design a one-pass streaming algorithm by dynamically updating the maximal function value among unit vectors along with the currently arriving element. Finally, in order to decrease the memory complexity, we design an improved streamingKnapsack algorithm and reduce the memory complexity to O (K /epsilon 2).(c) 2022 Elsevier B.V. All rights reserved.
In this work,we study a k-Cardinality Constrained Regularized Submodular Maximization(k-CCRSM)problem,in which the objective utility is expressed as the difference between a non-negative submodular and a modular *** m...
详细信息
In this work,we study a k-Cardinality Constrained Regularized Submodular Maximization(k-CCRSM)problem,in which the objective utility is expressed as the difference between a non-negative submodular and a modular *** multiplicative approximation algorithm exists for the regularized model,and most works have focused on designing weak approximation algorithms for this *** this study,we consider the k-CCRSM problem in a streaming fashion,wherein the elements are assumed to be visited individually and cannot be entirely stored in *** propose two multipass streaming algorithms with theoretical guarantees for the above problem,wherein submodular terms are monotonic and nonmonotonic.
We prove a hypercontractive inequality for matrix-valued functions defined over large alphabets. In order to do so, we prove a generalization of the powerful 2-uniform convexity inequality for trace norms of Ball, Car...
详细信息
We prove a hypercontractive inequality for matrix-valued functions defined over large alphabets. In order to do so, we prove a generalization of the powerful 2-uniform convexity inequality for trace norms of Ball, Carlen, Lieb (Inventiones Mathematicae'94). Using our hypercontractive inequality, we present upper and lower bounds for the communication complexity of the Hidden Hypermatching problem defined over large alphabets. We then consider streaming algorithms for approximating the value of Unique Games on a hypergraph with t-size hyperedges. By using our communication lower bound, we show that every streaming algorithm in the adversarial model achieving an (r - pound )-approximation of this value requires Omega( n 1-2 / t ) quantum space, where r is the alphabet size. We next present a lower bound for locally decodable codes (LDC) Z n r -> Z N r over large alphabets with recoverability probability at least 1/r + pound . Using hypercontractivity, we give an exponential lower bound N = 2 Omega(epsilon 4 n/r 4 ) for 2-query (possibly non-linear) LDCs over Z r and using the non-commutative Khintchine inequality we prove an improved lower bound of N = 2 Omega(epsilon 2 n/r 2 ) .
Estimating set similarity and detecting highly similar sets are fundamental problems in areas such as databases and machine learning. MinHash is a well-known technique for approximating Jaccard similarity of sets and ...
详细信息
Estimating set similarity and detecting highly similar sets are fundamental problems in areas such as databases and machine learning. MinHash is a well-known technique for approximating Jaccard similarity of sets and has been successfully used for many applications. Its two compressed versions, b-bit MinHash and Odd Sketch, can significantly reduce the memory usage of the MinHash, especially for estimating high similarities (i.e., similarities around 1). Although MinHash can be applied to static sets as well as streaming sets, of which elements are given in a streaming fashion, unfortunately, b-bit MinHash and Odd Sketch fail to deal with streaming data. To solve this problem, we previously designed a memory-efficient sketch method, MaxLogHash, to accurately estimate Jaccard similarities in streaming sets. Compared with MinHash, our method uses smaller sized registers (each register consists of less than 7 bits) to build a compact sketch for each set. In this paper, we further develop a faster method, MaxLogOPH++. Compared with MaxLogHash, MaxLogOPH++ reduces the time complexity for updating each coming element from O(k) to O(1) with a small additional memory. We conduct experiments on a variety of datasets, and experimental results demonstrate the efficiency and effectiveness of our methods.
Problems involving the efficient arrangement of simple objects, as captured by bin packing and makespan scheduling, are fundamental tasks in combinatorial optimization. These are well understood in the traditional onl...
详细信息
Problems involving the efficient arrangement of simple objects, as captured by bin packing and makespan scheduling, are fundamental tasks in combinatorial optimization. These are well understood in the traditional online and offline cases, but have been less well-studied when the volume of the input is truly massive, and cannot even be read into memory. This is captured by the streaming model of computation, where the aim is to approximate the cost of the solution in one pass over the data, using small space. As a result, streaming algorithms produce concise input summaries that approximately preserve the optimum value. We design the first efficient streaming algorithms for these fundamental problems in combinatorial optimization. For Bin Packing, we provide a streaming asymptotic (1 + epsilon)-approximation with (O) over tilde (1/epsilon), where (O) over tilde hides logarithmic factors. Moreover, such a space bound is essentially optimal. Our algorithm implies a streaming (d + epsilon)-approximation for VECTOR BIN PACKING in d dimensions, running in space (O) over tilde (d/epsilon). For the related VECTOR SCHEDULING problem, we show how to construct an input summary in space (O) over tilde (d(2) . m/epsilon(2)) that preserves the optimum value up to a factor of 2 - 1/m + epsilon, where m is the number of identical machines.
We consider maximizing a monotone submodular function under a cardinality constraint or a knapsack constraint in the streaming setting. In particular, the elements arrive sequentially and at any point of time, the alg...
详细信息
We consider maximizing a monotone submodular function under a cardinality constraint or a knapsack constraint in the streaming setting. In particular, the elements arrive sequentially and at any point of time, the algorithm has access to only a small fraction of the data stored in primary memory. We propose the following streaming algorithms taking O(epsilon(-1)) passes: (1) a (1-e(-1)-epsilon)-approximation algorithm for the cardinality-constrained problem, (2) a (0.5 - epsilon)-approximation algorithm for the knapsack-constrained problem. Both of our algorithms run deterministically in O* (n) time, using O* (K) space, where n is the size of the ground set and K is the size of the knapsack. Here the term O* hides a polynomial of log K and epsilon(-1). Our streaming algorithms can also be used as fast approximation algorithms. In particular, for the cardinality-constrained problem, our algorithm takes O (n epsilon(-1) log (epsilon(-1) log K)) time, improving on the algorithm of Badanidiyuru and Vondrak that takes O (n epsilon(-1)log (epsilon(-1) K)) time.
We give an (O) over tilde(root n)-space single-pass 0.483-approximation streaming algorithm for estimating the maximum directed cut size (Max-DICUT) in a directed graph on n vertices. This improves over an O(log n)-sp...
详细信息
ISBN:
(纸本)9798350318944
We give an (O) over tilde(root n)-space single-pass 0.483-approximation streaming algorithm for estimating the maximum directed cut size (Max-DICUT) in a directed graph on n vertices. This improves over an O(log n)-space 4/9 < 0.45 approximation algorithm due to Chou, Golovnev, and Velusamy (FOCS 2020), which was known to be optimal for o(root n)-space algorithms. Max-DICUT is a special case of a constraint satisfaction problem (CSP). In this broader context, we give the first CSP for which algorithms with <($)over tilde>(root n) space can provably outperform o(root n)-space algorithms. The key technical contribution of our work is development of the notions of a first-order snapshot of a (directed) graph and of estimates of such snapshots. These snapshots can be used to simulate certain (non-streaming) Max- DICUT algorithms, including the "oblivious" algorithms introduced by Feige and Jozeph (Algorithmica, 2015), who showed that one such algorithm achieves a 0.483-approximation. Previous work of the authors (SODA 2023) studied the restricted case of bounded-degree graphs, and observed that in this setting, it is straightforward to estimate the snapshot with l(1) errors and this suffices to simulate oblivious algorithms. But for unbounded-degree graphs, even defining an achievable and sufficient notion of estimation is subtle. We describe a new notion of snapshot estimation and prove its sufficiency using careful smoothing techniques, and then develop an algorithm which sketches such an estimate via a delicate process of intertwined vertex- and edge-subsampling. Prior to our work, the only streaming algorithms for any CSP on general instances were based on generalizations of the O(log n)-space algorithm for Max-DICUT, and can roughly be characterized as based on "zeroth" order snapshots. Our work thus opens the possibility of a new class of algorithms for approximating CSPs by demonstrating that more sophisticated snapshots can outperform cruder ones in the case of
暂无评论