Spectral algorithms are among the main tools for optimization and inference problems on graphs. Typically, the graph is encoded as a matrix, and eigenvectors and eigenvalues of that matrix are then used to solve the given graph problem. Spectral algorithms have been used successfully for graph partitioning, hidden clique recovery, and graph coloring. In this paper, we study the power of spectral algorithms that use two matrices in a graph partitioning problem: we form two matrices from two different encodings of the same graph and then combine the spectral information coming from both. We analyze this two-matrix spectral algorithm for the problem of identifying latent community structure in large random graphs. In particular, we consider the problem of recovering community assignments exactly in the censored stochastic block model, where each edge status is revealed independently with some probability. We show that spectral algorithms based on two matrices are optimal and succeed in recovering communities up to the information-theoretic threshold. Further, we show that for most choices of the parameters, any spectral algorithm based on one matrix is suboptimal. The latter observation stands in contrast to our prior works (2022a, 2022b), which showed that for the symmetric stochastic block model and the planted dense subgraph problem, a spectral algorithm based on one matrix achieves the information-theoretic threshold. We additionally provide more general geometric conditions for the (sub)optimality of spectral algorithms.
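A minimal sketch of the two-encoding idea, assuming a censored two-community graph stored as a symmetric matrix obs with entries +1 (revealed edge), -1 (revealed non-edge), and 0 (censored); the weights y1, y2 and the sign-combination rule are illustrative placeholders, not the paper's exact construction.

    import numpy as np
    from scipy.sparse.linalg import eigsh

    def encode(obs, y):
        # One encoding of the censored graph: revealed edges -> 1,
        # revealed non-edges -> y, censored pairs -> 0.
        return (obs == 1).astype(float) + y * (obs == -1).astype(float)

    def two_matrix_partition(obs, y1=0.4, y2=-0.4):
        # Combine spectral information from two encodings of the same
        # graph; y1 and y2 are hypothetical tuning weights.
        vecs = []
        for y in (y1, y2):
            A = encode(obs, y)
            _, V = eigsh(A, k=2, which='LA')  # two largest eigenpairs
            vecs.append(V[:, 0])              # second-largest eigenvector
        v1, v2 = vecs
        u = v1 + np.sign(v1 @ v2) * v2        # align signs, then combine
        return np.sign(u)                     # +/-1 community labels

For a balanced two-community instance, the sign pattern of the combined eigenvector serves as the community estimate; the paper's exact-recovery guarantee rests on its own choice of the two encodings.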
We study a class of spectral learning methods with dependent observations, including the popular ridge regression, Landweber iteration, and spectral cut-off. We derive an explicit risk bound in terms of the correlation of the observations, the regularity of the regression function, and the effective dimension of the reproducing kernel Hilbert space. By choosing the regularization parameter appropriately according to the sample size, the risk bound yields a nearly optimal learning rate, up to a logarithmic term, for strongly mixing sequences. We thus extend the applicable range of spectral algorithms to non-i.i.d. sampling processes. In particular, the learning rates for i.i.d. samples in the literature are recovered as our special case in which the mixing parameter tends to zero.
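A generic sketch of the family of estimators studied here, assuming a kernel matrix K and responses y; the two filter functions below (Tikhonov/ridge and spectral cut-off) are standard examples, and the parameter names are ours.

    import numpy as np

    def spectral_filter_fit(K, y, lam, method="ridge"):
        # Eigen-decompose the normalized kernel matrix and apply a
        # spectral filter g_lam to its eigenvalues.
        n = len(y)
        evals, evecs = np.linalg.eigh(K / n)
        if method == "ridge":        # Tikhonov: g(s) = 1 / (s + lam)
            g = 1.0 / (evals + lam)
        elif method == "cutoff":     # keep components with s >= lam
            g = np.where(evals >= lam, 1.0 / np.maximum(evals, 1e-12), 0.0)
        else:
            raise ValueError(method)
        # Coefficients of f_hat(.) = sum_i c_i k(x_i, .)
        return evecs @ (g * (evecs.T @ y)) / n

With method="ridge" this reproduces kernel ridge regression, c = (K + n*lam*I)^{-1} y; the abstract's rates come from coupling lam with the sample size and the mixing coefficients.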
In the misspecified spectral algorithms problem, researchers usually assume that the underlying true function f*ρ lies in [ℌ]s, a less-smooth interpolation space of a reproducing kernel Hilbert space (RKHS) ℌ, for some s ∈ (0, 1). The existing minimax-optimal results require ||f*ρ||L∞ < ∞, which implicitly requires s > α0, where α0 ∈ (0, 1) is the embedding index, a constant depending on ℌ. Whether spectral algorithms are optimal for all s ∈ (0, 1) has been an open problem for years. In this paper, we show that spectral algorithms are minimax optimal for any α0 - 1/β < s < 1, where β is the eigenvalue decay rate of ℌ. We also give several classes of RKHSs whose embedding index satisfies α0 = 1/β; on these RKHSs, spectral algorithms are therefore minimax optimal for all s ∈ (0, 1).
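For orientation, the standard setting behind these symbols (our paraphrase; the displayed rate is the usual minimax target in this literature, under polynomial Mercer eigenvalue decay):

    \[
      \lambda_i \asymp i^{-\beta}, \qquad
      f_\rho^* \in [\mathcal{H}]^{s}
      = \Bigl\{ \sum_i a_i \lambda_i^{s/2} e_i : \sum_i a_i^2 < \infty \Bigr\},
      \quad s \in (0,1),
    \]
    \[
      \text{minimax rate } n^{-\frac{s\beta}{s\beta+1}}, \qquad
      \text{optimality shown for } \alpha_0 - \tfrac{1}{\beta} < s < 1,
      \quad \tfrac{1}{\beta} \le \alpha_0 .
    \]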
ISBN: (Print) 9783031598340; 9783031598357
We provide the first online algorithm for spectral hypergraph sparsification. In the online setting, hyperedges with positive weights arrive in a stream, and upon the arrival of each hyperedge we must irrevocably decide whether or not to include it in the sparsifier. Our algorithm produces an (ε, δ)-spectral sparsifier with multiplicative error ε and additive error δ that has O(ε⁻² n log n log r log(1 + εW/(δn))) hyperedges with high probability, where ε, δ ∈ (0, 1), n is the number of nodes, r is the rank of the hypergraph, and W is the sum of edge weights. The space complexity of our algorithm is O(n²), while previous algorithms required space Ω(m), where m is the number of hyperedges. This is an exponential improvement in space complexity, since m can be exponential in n.
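A sketch of the irrevocable sample-and-reweight pattern in the online setting, assuming hyperedges arrive as (vertex list, weight) pairs; the importance score below (a pairwise effective-resistance proxy on the clique expansion seen so far, which needs only O(n²) memory) and the constants c, eps are illustrative, not the paper's actual sampling probabilities.

    import numpy as np

    def online_sparsify(stream, n, eps=0.5, c=10.0, seed=0):
        rng = np.random.default_rng(seed)
        L = 1e-6 * np.eye(n)       # clique-expansion Laplacian seen so far
        sparsifier = []
        for verts, w in stream:
            verts = list(verts)
            Linv = np.linalg.pinv(L)
            # Importance proxy: max pairwise effective resistance inside e.
            R = max((Linv[u, u] + Linv[v, v] - 2 * Linv[u, v]
                     for i, u in enumerate(verts) for v in verts[i + 1:]),
                    default=1.0)
            p = min(1.0, c * w * R / eps ** 2)
            if rng.random() < p:               # irrevocable decision
                sparsifier.append((verts, w / p))
            for i, u in enumerate(verts):      # update with the true edge
                for v in verts[i + 1:]:
                    L[u, u] += w; L[v, v] += w
                    L[u, v] -= w; L[v, u] -= w
        return sparsifier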
We consider a family of well-known scheduling problems that reduce to the problem of finding a minimum-weight clique in a complete weighted graph in which negative weights and self-loops are allowed. We present a uniform algorithmic approach to finding optimal as well as suboptimal solutions for these problems, and we report results of computational tests for the suboptimal algorithms developed in the paper.
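To pin down the target problem: in a complete graph every vertex subset is a clique, so with a weight matrix W (self-loops on the diagonal, negative entries allowed) the task is to minimize the total weight over subsets. A brute-force sketch for small instances, purely to make the objective concrete; it is not one of the paper's algorithms.

    from itertools import combinations

    def min_weighted_clique(W):
        n = len(W)
        best_val, best_S = 0.0, ()   # the empty clique has weight 0
        for k in range(1, n + 1):
            for S in combinations(range(n), k):
                # Sum over unordered pairs in S, self-loops counted once.
                val = sum(W[i][j] for a, i in enumerate(S) for j in S[a:])
                if val < best_val:
                    best_val, best_S = val, S
        return best_val, best_S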
We propose a new algorithm for spectral learning of hidden Markov models (HMMs). In contrast to the standard approach, we do not estimate the parameters of the HMM directly but instead construct an estimate of the joint probability distribution. The idea is based on representing the joint probability distribution as a d-th-order tensor with low ranks in the tensor-train (TT) format. Using the TT format, we obtain an approximation by minimizing the Frobenius distance between the empirical joint probability distribution and tensors with low TT-ranks, subject to normalization constraints on the core tensors. We propose an algorithm for solving this optimization problem based on the alternating least squares (ALS) approach and develop a fast version of it for sparse tensors. The order d of the tensor is a parameter of our algorithm. We compared the performance of our algorithm with the existing algorithm proposed by Hsu, Kakade and Zhang in 2009 and found that ours is much more robust when the number of hidden states is overestimated.
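The representation at the heart of the method, sketched here via the classical TT-SVD (sequential truncated SVDs); the paper instead fits the cores by ALS under normalization constraints, so this only illustrates what a low-TT-rank approximation of the empirical joint distribution looks like.

    import numpy as np

    def tt_svd(T, max_rank):
        # Compress a d-th-order tensor T into TT cores G_1, ..., G_d,
        # G_k of shape (r_{k-1}, dim_k, r_k), by sequential truncated SVDs.
        dims = T.shape
        cores, r_prev = [], 1
        M = T.reshape(dims[0], -1)
        for k in range(len(dims) - 1):
            U, s, Vt = np.linalg.svd(M, full_matrices=False)
            r = min(max_rank, len(s))
            cores.append(U[:, :r].reshape(r_prev, dims[k], r))
            M = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
            r_prev = r
        cores.append(M.reshape(r_prev, dims[-1], 1))
        return cores

Here T would be the empirical tensor whose entry T[x1, ..., xd] is the relative frequency of the observation window (x1, ..., xd), with the TT-ranks playing the role of the (possibly overestimated) number of hidden states.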
We study the matrix completion problem: an underlying m × n matrix P is low rank, with incoherent singular vectors, and a random m × n matrix A is equal to P on a (uniformly) random subset of entries of size dn. All other entries of A are equal to zero. The goal is to retrieve information on P from the observation of A. Let A_1 be the random matrix where each entry of A is multiplied by an independent Bernoulli(1/2) random variable. This paper is about when, how and why the non-Hermitian eigen-spectra of the matrices A_1(A - A_1)* and (A - A_1)*A_1 capture more of the relevant information about the principal component structure of A than the eigen-spectra of AA* and A*A. We show that the eigenvalues of the asymmetric matrices A_1(A - A_1)* and (A - A_1)*A_1 with modulus greater than a detection threshold are asymptotically equal to the eigenvalues of PP* and P*P, and that the associated eigenvectors are aligned as well. The central surprise is that by intentionally inducing asymmetry and additional randomness via the A_1 matrix, we can extract more information than if we had worked with the singular value decomposition (SVD) of A. The associated detection threshold is asymptotically exact and is non-universal, since it explicitly depends on the element-wise distribution of the underlying matrix P. We show that reliable, statistically optimal but not perfect matrix recovery, via a universal data-driven algorithm, is possible above this detection threshold using the information extracted from the asymmetric eigen-decompositions. Averaging the left and right eigenvectors provably improves estimation accuracy but not the detection threshold. Our results encompass the very sparse regime, where d is of order 1, in which matrix completion via the SVD of A fails or produces unreliable recovery. We define another variant of this asymmetric principal component analysis procedure that bypasses the randomization step and has a detection threshold that ...
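A sketch of the randomized-splitting step, assuming a real observed matrix A; the paper's detection threshold and its eigenvector-averaging estimator are omitted, and k is simply the number of eigenvalues kept.

    import numpy as np

    def asymmetric_spectra(A, k, seed=0):
        rng = np.random.default_rng(seed)
        mask = rng.integers(0, 2, size=A.shape)   # independent fair coins
        A1 = A * mask                             # each entry kept w.p. 1/2
        B = A1 @ (A - A1).T                       # non-Hermitian product
        evals, evecs = np.linalg.eig(B)
        keep = np.argsort(-np.abs(evals))[:k]     # k largest in modulus
        return evals[keep], evecs[:, keep]

Eigenvalues of B whose modulus clears the (distribution-dependent) threshold estimate the spectrum of PP*, which is the information the SVD of A alone fails to expose in the very sparse regime.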
The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model for studying clustering and community detection, and it generally provides fertile ground for studying the statistical and computational tradeoffs that arise in network and data sciences. This note surveys recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters, and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest to achieve these limits, in particular two-round algorithms via graph-splitting, semidefinite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.
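For the symmetric two-community case, the two thresholds mentioned read as follows (standard statements from this literature, restated here for orientation):

    \[
      \text{exact recovery in } \mathrm{SSBM}\bigl(n, 2,
        \tfrac{a \log n}{n}, \tfrac{b \log n}{n}\bigr)
      \;\Longleftrightarrow\; |\sqrt{a} - \sqrt{b}| \ge \sqrt{2}
      \quad \text{(Chernoff-Hellinger)},
    \]
    \[
      \text{weak recovery in } \mathrm{SSBM}\bigl(n, 2,
        \tfrac{a}{n}, \tfrac{b}{n}\bigr)
      \;\Longleftrightarrow\; (a - b)^2 > 2(a + b)
      \quad \text{(Kesten-Stigum)}.
    \]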
The gradient method is a simple optimization approach that uses the negative gradient of the objective function as a search direction. Its efficiency relies heavily on the choice of stepsize. In this paper, we analyze the convergence behavior of a class of gradient methods whose stepsize has an important property introduced in (Dai in Optimization 52:395-415, 2003). Our analysis focuses on the minimization of strictly convex quadratic functions. We establish R-linear convergence and derive an estimate for the R-factor. Specifically, if the stepsize can be expressed as a collection of Rayleigh quotients of the inverse Hessian matrix, we are able to show that these methods converge R-linearly and that their R-factors are bounded above by 1 - 1/κ, where κ is the associated condition number. Preliminary numerical results demonstrate the tightness of our estimate of the R-factor.
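A minimal illustration on a strictly convex quadratic, using the exact line-search (Cauchy) stepsize gᵀg / gᵀHg, one standard stepsize expressible through Rayleigh quotients of the (inverse) Hessian; the class analyzed in the paper is broader.

    import numpy as np

    def gradient_method(H, b, x0, steps=200):
        # Minimize f(x) = x^T H x / 2 - b^T x via x_{k+1} = x_k - alpha_k g_k.
        x, res = x0.astype(float), []
        for _ in range(steps):
            g = H @ x - b
            alpha = (g @ g) / (g @ (H @ g))   # Cauchy stepsize
            x = x - alpha * g
            res.append(np.linalg.norm(g))     # gradient norm history
        return x, res

Plotting res on a log scale exhibits the R-linear decay; a bound of the form 1 - 1/κ, with κ the condition number of H, caps the asymptotic rate.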
Given a measurement graph G = (V, E) and an unknown signal r ∈ R^n, we investigate algorithms for recovering r from pairwise measurements of the form r_i - r_j, {i, j} ∈ E. This problem arises in a variety of applications, such as ranking teams in sports data and time synchronization of distributed networks. Framed in the context of ranking, the task is to recover the ranking of n teams (induced by r) given a small subset of noisy pairwise rank offsets. We propose a simple SVD-based algorithmic pipeline for both the time synchronization and ranking problems. We provide a detailed theoretical analysis of robustness against both sampling sparsity and noise perturbations with outliers, using results from matrix perturbation and random matrix theory. Our theoretical findings are complemented by a detailed set of numerical experiments on both synthetic and real data, showcasing the competitiveness of our proposed algorithms with other state-of-the-art methods.
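The low-rank observation driving an SVD-based pipeline: the complete offset matrix r1ᵀ - 1rᵀ has rank 2, so a rescaled top-2 SVD of the partially observed, noisy measurement matrix already yields a score vector. A minimal sketch under that assumption; the paper's pipeline and analysis include more than this.

    import numpy as np

    def svd_rank(C, mask):
        # C[i, j] holds the noisy measurement of r_i - r_j where
        # mask[i, j] == 1, and 0 elsewhere; C is skew-symmetric.
        p = max(mask.mean(), 1e-12)          # estimated sampling rate
        U, s, Vt = np.linalg.svd(C / p)
        C2 = (U[:, :2] * s[:2]) @ Vt[:2]     # best rank-2 approximation
        scores = C2.mean(axis=1)             # row means of r1^T - 1r^T
        return np.argsort(-scores)           # ranking, best item first

The row means work because row i of r1ᵀ - 1rᵀ averages to r_i - mean(r), so sorting the scores sorts the latent signal.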