In this paper, we focus on estimating the distribution of the underlying parameter over random networks by reconstructing the empirical distribution of the initial samples, which can be viewed as a particular average consensus problem. A class of quantized communication protocols, in which neighbors exchange information randomly selected on the basis of the current estimate, is considered. To improve the convergence rate of this distributed random sampling algorithm, we introduce the Polyak averaging scheme and show that asymptotic efficiency can be achieved through the averaging technique under proper conditions. The results show that the minimum limiting covariance matrix of the estimation error is attained, i.e., the proposed algorithm achieves the highest possible rate of convergence. Finally, we provide a numerical simulation to validate the theoretical results of this work.
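As a rough illustration of the flavor of such a scheme, the sketch below sets up a toy ring network in which each node holds one sample, repeatedly receives one random bit per grid point drawn from a neighbor's current estimate of the empirical CDF, and keeps a Polyak (running) average of its noisy iterates. The ring topology, step-size schedule, and one-bit quantizer are assumptions made for this sketch, not the protocol analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative only): each node holds one sample, and the network
# goal is to reconstruct the empirical CDF of all samples at fixed grid points.
n_nodes, n_grid, n_rounds = 20, 11, 20000
grid = np.linspace(0.0, 1.0, n_grid)
samples = rng.beta(2.0, 5.0, size=n_nodes)
target = (samples[:, None] <= grid).mean(axis=0)      # empirical CDF (the consensus value)

neighbours = [[(i - 1) % n_nodes, (i + 1) % n_nodes]   # ring topology (an assumption)
              for i in range(n_nodes)]

x = (samples[:, None] <= grid).astype(float)           # local estimates, one row per node
x_bar = x.copy()                                       # Polyak (running) averages
counts = np.zeros(n_nodes)                             # per-node update counters

for t in range(1, n_rounds + 1):
    a_t = 1.0 / t**0.75                                # diminishing step size (illustrative)
    i = rng.integers(n_nodes)
    j = rng.choice(neighbours[i])
    # quantized message: one random bit per grid point, drawn from j's current estimate
    bits = rng.random(n_grid) < x[j]
    x[i] = (1.0 - a_t) * x[i] + a_t * bits
    # Polyak averaging of node i's iterates
    counts[i] += 1
    x_bar[i] += (x[i] - x_bar[i]) / counts[i]

print("max deviation of raw iterates   :", np.abs(x - target).max())
print("max deviation of Polyak averages:", np.abs(x_bar - target).max())
```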
The subset-sum problem (SSP) is defined as follows: given a positive integer bound and a set of n positive integers, find a subset whose sum is closest to, but not greater than, the bound. We present a randomized approximation algorithm for this problem with linear space complexity and O(n log n) time complexity. Experiments with random uniformly distributed instances of SSP show that our algorithm outperforms, both in running time and in average error, Martello and Toth's (1984) quadratic greedy search, whose time complexity is O(n²). We propose conjectures on the expected error of our algorithm for uniformly distributed instances of SSP and provide some analytical arguments justifying these conjectures. We also present the results of numerous tests.
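For concreteness, a minimal randomized baseline in the same spirit (but not the paper's O(n log n) algorithm) is a best-of-several-restarts greedy: shuffle the items, add them greedily while the bound is respected, and keep the best subset found. The restart count and the plain shuffle are illustrative choices.

```python
import random

def greedy_fill(items, bound):
    """Greedily add items (in the given order) while the running sum stays <= bound."""
    total, chosen = 0, []
    for v in items:
        if total + v <= bound:
            total += v
            chosen.append(v)
    return total, chosen

def randomized_ssp(items, bound, restarts=32, seed=0):
    """Best of several random restarts of the greedy fill.
    Illustrative only; this is NOT the algorithm proposed in the paper."""
    rng = random.Random(seed)
    best_total, best_set = 0, []
    order = list(items)
    for _ in range(restarts):
        rng.shuffle(order)
        total, chosen = greedy_fill(order, bound)
        if total > best_total:
            best_total, best_set = total, chosen
            if best_total == bound:          # cannot do better than hitting the bound
                break
    return best_total, best_set

if __name__ == "__main__":
    rng = random.Random(1)
    items = [rng.randint(1, 10**6) for _ in range(1000)]
    bound = sum(items) // 4
    total, _ = randomized_ssp(items, bound)
    print(f"bound={bound}, achieved={total}, gap={bound - total}")
```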
Least squares approximation is a technique for finding an approximate solution to a system of linear equations that has no exact solution. In this paper, we are concerned with fast randomized approximation of Linear Least Squares (LLS) problems. Two algorithms are presented and compared: the first combines count-sketch with the subsampled randomized Hadamard transform, and the second combines count-sketch with a Gaussian projection. Both algorithms make use of QR factorization. The condition number of the randomized LLS problem is computed. Finally, an application in space geodesy is presented, in which the geopotential harmonic coefficients are computed with the aim of evaluating the gravitational potential.
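A stripped-down sketch-and-solve variant is shown below: it applies a count-sketch alone and then solves the reduced problem through a thin QR factorization. The paper's algorithms additionally compose the count-sketch with an SRHT or a Gaussian projection; the sketch size of four times the number of unknowns is an arbitrary illustrative choice.

```python
import numpy as np

def count_sketch(M, s, rng):
    """Apply an s-row count-sketch to the rows of M: hash each row to a bucket with a random sign."""
    m = M.shape[0]
    h = rng.integers(s, size=m)             # bucket for each row
    sgn = rng.choice([-1.0, 1.0], size=m)   # random sign for each row
    SM = np.zeros((s, M.shape[1]))
    np.add.at(SM, h, sgn[:, None] * M)      # unbuffered accumulation into buckets
    return SM

def sketched_lls(A, b, s=None, seed=0):
    """Sketch-and-solve least squares: minimise ||S(Ax - b)|| via QR of the sketched matrix.
    A minimal baseline, not the paper's count-sketch + SRHT / Gaussian pipeline."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    s = s or 4 * n
    SAb = count_sketch(np.hstack([A, b[:, None]]), s, rng)
    SA, Sb = SAb[:, :n], SAb[:, n]
    Q, R = np.linalg.qr(SA)                 # thin QR of the sketched matrix
    return np.linalg.solve(R, Q.T @ Sb)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((20000, 50))
    x_true = rng.standard_normal(50)
    b = A @ x_true + 0.01 * rng.standard_normal(20000)
    x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
    x_fast = sketched_lls(A, b)
    print("relative error vs exact LS solution:",
          np.linalg.norm(x_fast - x_exact) / np.linalg.norm(x_exact))
```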
ISBN (print): 9781509018253
This paper considers subspace recovery in the presence of outliers in a decentralized setting. The intrinsic low-dimensional geometry of the data is exploited to substantially reduce the processing and communication overhead, given the limited sensing and communication resources at the sensing nodes. A small subset of the data is first selected. The data is embedded into a random low-dimensional subspace and then forwarded to a central processing unit that runs a low-complexity algorithm to recover the subspace directly from the data sketch. We derive sufficient conditions on the compression and communication rates to successfully recover the subspace with high probability. It is shown that the proposed approach is robust to outliers and that its complexity is independent of the dimension of the whole data matrix. The proposed algorithm provably achieves notable speedups in comparison to existing approaches for robust subspace recovery.
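The sketch below mimics the compress-and-forward pipeline on toy data: each node selects a few of its columns, embeds them with a shared random matrix, and a fusion centre estimates the subspace of the pooled sketch by an SVD. The column-selection rule, the unit-norm rescaling used to blunt outliers, and the plain SVD are stand-ins chosen for this sketch, not the paper's low-complexity robust recovery algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: columns lie near a d-dimensional subspace of R^D, plus a few gross outliers.
D, d, n_cols, n_nodes = 200, 5, 2000, 10
U_true, _ = np.linalg.qr(rng.standard_normal((D, d)))
X = U_true @ rng.standard_normal((d, n_cols)) + 0.01 * rng.standard_normal((D, n_cols))
outliers = rng.choice(n_cols, size=50, replace=False)
X[:, outliers] = 5.0 * rng.standard_normal((D, 50))

m = 4 * d                                               # compressed dimension (illustrative)
Phi = rng.standard_normal((m, D)) / np.sqrt(m)          # shared random embedding

# Each node keeps a slice of the columns, samples a few of them,
# compresses them with Phi and forwards the small sketch to the fusion centre.
node_cols = np.array_split(np.arange(n_cols), n_nodes)
sketches = []
for cols in node_cols:
    picked = rng.choice(cols, size=20, replace=False)   # local column selection
    Y = Phi @ X[:, picked]
    Y /= np.linalg.norm(Y, axis=0, keepdims=True)       # crude outlier mitigation
    sketches.append(Y)

# Fusion centre: estimate the subspace of the compressed data from the pooled sketch.
S = np.hstack(sketches)
U_comp, _, _ = np.linalg.svd(S, full_matrices=False)
basis_comp = U_comp[:, :d]

# Compare against the compressed true subspace span(Phi @ U_true).
Q_true, _ = np.linalg.qr(Phi @ U_true)
err = np.linalg.norm(Q_true @ Q_true.T - basis_comp @ basis_comp.T, 2)
print("error of recovered compressed subspace (spectral norm):", err)
```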
We study the query complexity of finding the set of all Nash equilibria \(\mathcal {X}_\ast \times \mathcal {Y}_\ast\) in two-player zero-sum matrix games. Fearnley and Savani [18] showed that for any randomized algorithm, there exists an n × n input matrix where it needs to query Ω(n²) entries in expectation to compute a single Nash equilibrium. On the other hand, Bienstock et al. [5] showed that there is a special class of matrices for which one can query O(n) entries and compute its set of all Nash equilibria. However, these results do not fully characterize the query complexity of finding the set of all Nash equilibria in two-player zero-sum matrix games. In this work, we characterize the query complexity of finding the set of all Nash equilibria \(\mathcal {X}_\ast \times \mathcal {Y}_\ast\) in terms of the number of rows n of the input matrix \(A \in \mathbb {R}^{n \times n}\), the row support size \(k_1 := |\bigcup \limits _{x \in \mathcal {X}_\ast } \text{supp}(x)|\), and the column support size \(k_2 := |\bigcup \limits _{y \in \mathcal {Y}_\ast } \text{supp}(y)|\). We design a simple yet non-trivial randomized algorithm that returns the set of all Nash equilibria \(\mathcal {X}_\ast \times \mathcal {Y}_\ast\) by querying at most O(nk⁵ · polylog(n)) entries of the input matrix \(A \in \mathbb {R}^{n \times n}\) in expectation, where k := max{k₁, k₂}. This upper bound is tight up to a factor of poly(k), as we show that for any randomized algorithm, there exists an n × n input matrix with min{k₁, k₂} = 1 for which it needs to query Ω(nk) entries in expectation in order to find the set of all Nash equilibria \(\mathcal {X}_\ast \times \mathcal {Y}_\ast\).
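To make the object \(\mathcal {X}_\ast \times \mathcal {Y}_\ast\) concrete, the snippet below computes one equilibrium (and the game value) of a zero-sum matrix game by the standard linear-programming reduction; note that it reads every entry of the matrix, which is exactly the cost a query-efficient algorithm avoids. The SciPy-based implementation is illustrative only and is not the algorithm from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def _maximin(A):
    """Maximin mixed strategy for the row player of payoff matrix A (maximise min_j (A^T x)_j)."""
    n, m = A.shape
    c = np.concatenate([np.zeros(n), [-1.0]])           # variables (x, v); minimise -v
    A_ub = np.hstack([-A.T, np.ones((m, 1))])           # v <= (A^T x)_j for every column j
    A_eq = np.concatenate([np.ones(n), [0.0]])[None, :] # x sums to one
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m),
                  A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n], -res.fun

def zero_sum_equilibrium(A):
    """One Nash equilibrium (x, y) and the value of the game where the row player maximises x^T A y."""
    x, value = _maximin(A)       # row player's maximin strategy
    y, _ = _maximin(-A.T)        # column player's minimax strategy (row player of -A^T)
    return x, y, value

if __name__ == "__main__":
    A = np.array([[0.0, -1.0, 1.0],
                  [1.0, 0.0, -1.0],
                  [-1.0, 1.0, 0.0]])      # rock-paper-scissors
    x, y, v = zero_sum_equilibrium(A)
    print("x* =", np.round(x, 3), " y* =", np.round(y, 3), " value =", round(v, 3))
```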
One popular method for dealing with large-scale data sets is sampling. For example, by using the empirical statistical leverage scores as an importance sampling distribution, the method of algorithmic leveraging samples and rescales rows/columns of data matrices to reduce the data size before performing computations on the subproblem. This method has been successful in improving the computational efficiency of algorithms for matrix problems such as least-squares approximation, least absolute deviations approximation, and low-rank matrix approximation. Existing work has focused on algorithmic issues such as worst-case running times and numerical issues associated with providing high-quality implementations, but none of it addresses the statistical aspects of this method. In this paper, we provide a simple yet effective framework to evaluate the statistical properties of algorithmic leveraging in the context of estimating parameters in a linear regression model with a fixed number of predictors. In particular, for several versions of leverage-based sampling, we derive results for the bias and variance, both conditional and unconditional on the observed data. We show that, from the statistical perspective of bias and variance, neither leverage-based sampling nor uniform sampling dominates the other. This result is particularly striking, given the well-known result that, from the algorithmic perspective of worst-case analysis, leverage-based sampling provides uniformly superior worst-case algorithmic results when compared with uniform sampling. Based on these theoretical results, we propose and analyze two new leveraging algorithms: one constructs a smaller least-squares problem with "shrinkage" leverage scores (SLEV), and the other solves a smaller and unweighted (or biased) least-squares problem (LEVUNW). A detailed empirical evaluation of existing leverage-based methods as well as these two new methods is carried out on both synthetic and real data sets. The empirical results
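A compact sketch of the sampling schemes discussed above (plain leverage-based sampling, shrinkage leverage scores, and the unweighted variant) is given below; the subsample size, the shrinkage weight, and the synthetic data are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def leverage_scores(X):
    """Exact leverage scores: squared row norms of an orthonormal basis of range(X)."""
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

def leveraging_ls(X, y, r, scheme="LEV", alpha=0.9, seed=0):
    """Subsampled least squares with leverage-based row sampling.
      LEV    - sample with probabilities proportional to leverage scores, reweight rows;
      SLEV   - same, but with 'shrinkage' probabilities alpha*lev + (1-alpha)*uniform;
      LEVUNW - leverage-based sampling, but solve the UNweighted subproblem."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    lev = leverage_scores(X)
    probs = lev / lev.sum()
    if scheme == "SLEV":
        probs = alpha * probs + (1.0 - alpha) / n
    idx = rng.choice(n, size=r, replace=True, p=probs)
    if scheme == "LEVUNW":
        Xs, ys = X[idx], y[idx]                     # unweighted (biased) subproblem
    else:
        w = 1.0 / np.sqrt(r * probs[idx])           # importance-sampling reweighting
        Xs, ys = X[idx] * w[:, None], y[idx] * w
    return np.linalg.lstsq(Xs, ys, rcond=None)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 20000, 10
    X = rng.standard_normal((n, p)) * rng.gamma(1.0, 1.0, size=(n, 1))  # nonuniform leverage
    beta = rng.standard_normal(p)
    y = X @ beta + rng.standard_normal(n)
    full = np.linalg.lstsq(X, y, rcond=None)[0]
    for scheme in ("LEV", "SLEV", "LEVUNW"):
        est = leveraging_ls(X, y, r=500, scheme=scheme)
        print(scheme, "relative deviation from full-data OLS:",
              round(np.linalg.norm(est - full) / np.linalg.norm(full), 4))
```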
The statistical leverage scores of a matrix A are the squared row-norms of the matrix containing its (top) left singular vectors, and the coherence is the largest leverage score. These quantities are of interest in recently popular problems such as matrix completion and Nyström-based low-rank matrix approximation, as well as in large-scale statistical data analysis applications more generally; moreover, they are of interest since they define the key structural nonuniformity that must be dealt with in developing fast randomized matrix algorithms. Our main result is a randomized algorithm that takes as input an arbitrary n × d matrix A, with n ≫ d, and returns as output relative-error approximations to all n of the statistical leverage scores. The proposed algorithm runs (under assumptions on the precise values of n and d) in O(nd log n) time, as opposed to the O(nd²) time required by the naïve algorithm that computes an orthogonal basis for the range of A. Our analysis may be viewed in terms of computing a relative-error approximation to an underconstrained least-squares approximation problem, or, relatedly, it may be viewed as an application of Johnson-Lindenstrauss type ideas. Several practically important extensions of our basic result are also described, including the approximation of so-called cross-leverage scores, the extension of these ideas to matrices with n ≈ d, and the extension to streaming environments.
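A rough sketch of this two-stage recipe is given below, with dense Gaussian projections standing in for both the fast Hadamard-type transform and the Johnson-Lindenstrauss stage; it reproduces the structure (sketch A, take a QR factorization, then compress the rows of A R⁻¹) but not the O(nd log n) running time, and the sketch sizes are illustrative choices.

```python
import numpy as np

def exact_leverage_scores(A):
    """Squared row norms of an orthonormal basis of range(A); costs O(n d^2)."""
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def approx_leverage_scores(A, r1=None, r2=None, seed=0):
    """Two-stage randomized approximation of the leverage scores of a tall matrix A.
    Dense Gaussian projections are used in both stages for simplicity; the fast
    running time in the abstract requires a Hadamard-type transform in stage one."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    r1 = r1 or 4 * d                          # rows in the first sketch
    r2 = r2 or max(8, int(4 * np.log(n)))     # JL target dimension
    Pi1 = rng.standard_normal((r1, n)) / np.sqrt(r1)
    _, R = np.linalg.qr(Pi1 @ A)              # R from the sketched matrix: A R^{-1} is nearly orthonormal
    Pi2 = rng.standard_normal((d, r2)) / np.sqrt(r2)
    B = A @ np.linalg.solve(R, Pi2)           # n x r2 compression of the rows of A R^{-1}
    return np.sum(B**2, axis=1)               # approximate leverage scores

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((50000, 20)) * rng.gamma(0.5, 1.0, size=(50000, 1))
    exact = exact_leverage_scores(A)
    approx = approx_leverage_scores(A)
    rel_err = np.abs(approx - exact) / np.maximum(exact, 1e-12)
    print("median / max relative error:", np.median(rel_err), rel_err.max())
```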