We develop a message-passing algorithm for noisy matrix completion problems based on matrix factorization. The algorithm is derived by approximating message distributions of belief propagation with Gaussian distributi...
详细信息
We develop a message-passing algorithm for noisy matrix completion problems based on matrix factorization. The algorithm is derived by approximating message distributions of belief propagation with Gaussian distributions that share the same first and second moments. We also derive a memory-friendly version of the proposed algorithm by applying a perturbation treatment commonly used in the literature of approximate messagepassing. In addition, a damping technique, which is demonstrated to be crucial for optimal performance, is introduced without computational strain, and the relationship to the message-passing version of alternating least squares, a method reported to be optimal in certain settings, is discussed. Experiments on synthetic datasets show that while the proposed algorithm quantitatively exhibits almost the same performance under settings where the earlier algorithm is optimal, it is advantageous when the observed datasets are corrupted by non-Gaussian noise. Experiments on real-world datasets also emphasize the performance differences between the two algorithms.
Approximate messagepassing (AMP) has been shown to be an excellent statistical approach to signal inference and compressed sensing problems. The AMP framework provides modularity in the choice of signal prior;here we...
详细信息
Approximate messagepassing (AMP) has been shown to be an excellent statistical approach to signal inference and compressed sensing problems. The AMP framework provides modularity in the choice of signal prior;here we propose a hierarchical form of the Gauss-Bernoulli prior which utilizes a restricted Boltzmann machine (RBM) trained on the signal support to push reconstruction performance beyond that of simple i.i.d. priors for signals whose support can be well represented by a trained binary RBM. We present and analyze two methods of RBM factorization and demonstrate how these affect signal reconstruction performance within our proposed algorithm. Finally, using the MNIST handwritten digit dataset, we show experimentally that using an RBM allows AMP to approach oracle-support performance.
The problem of aligning Erdos-Renyi random graphs is a noisy, average-case version of the graph isomorphism problem, in which a pair of correlated random graphs is observed through a random permutation of their vertic...
详细信息
The problem of aligning Erdos-Renyi random graphs is a noisy, average-case version of the graph isomorphism problem, in which a pair of correlated random graphs is observed through a random permutation of their vertices. We study a polynomial time message-passing algorithm devised to solve the inference problem of partially recovering the hidden permutation, in the sparse regime with constant average degrees. We perform extensive numerical simulations to determine the range of parameters in which this algorithm achieves partial recovery. We also introduce a generalized ensemble of correlated random graphs with prescribed degree distributions, and extend the algorithm to this case.
We use the cavity method to study the parallel dynamics of disordered Ising models on a graph. In particular, we derive a set of recursive equations in single-site probabilities of paths propagating along the edges of...
详细信息
We use the cavity method to study the parallel dynamics of disordered Ising models on a graph. In particular, we derive a set of recursive equations in single-site probabilities of paths propagating along the edges of the graph. These equations are analogous to the cavity equations for equilibrium models and are exact on a tree. On graphs with exclusively directed edges we find an exact expression for the stationary distribution. We present the phase diagrams for an Ising model on an asymmetric Bethe lattice and for a neural network with Hebbian interactions on an asymmetric scale-free graph. For graphs with a nonzero fraction of symmetric edges the equations can be solved for a finite number of time steps. Theoretical predictions are confirmed by simulations. Using a heuristic method the cavity equations are extended to a set of equations that determine the marginals of the stationary distribution of Ising models on graphs with a nonzero fraction of symmetric edges. The results from this method are discussed and compared with simulations.
Over-parametrization was a crucial ingredient for recent developments in inference and machine-learning fields. However a good theory explaining this success is still lacking. In this paper we study a very simple case...
详细信息
Over-parametrization was a crucial ingredient for recent developments in inference and machine-learning fields. However a good theory explaining this success is still lacking. In this paper we study a very simple case of mismatched over-parametrized algorithm applied to one of the most studied inference problem: the planted clique problem. We analyze a Monte Carlo (MC) algorithm in the same class of the famous Jerrum algorithm. We show how this MC algorithm is in general suboptimal for the recovery of the planted clique. We show however how to enhance its performances by adding a (mismatched) parameter: the temperature;we numerically find that this over-parametrized version of the algorithm can reach the supposed algorithmic threshold for the planted clique problem.
Neural networks are able to extract information from the timing of spikes. Here we provide new results on the behavior of the simplest neuronal model which is able to decode information embedded in temporal spike patt...
详细信息
Neural networks are able to extract information from the timing of spikes. Here we provide new results on the behavior of the simplest neuronal model which is able to decode information embedded in temporal spike patterns, the so-called tempotron. Using statistical physics techniques we compute the capacity for the case of sparse, time-discretized input, and 'material' discrete synapses, showing that the device saturates the information theoretic bounds with a statistics of output spikes that is consistent with the statistics of the inputs. We also derive two simple and highly efficient learning algorithms which are able to learn a number of associations which are close to the theoretical limit. The simplest versions of these algorithms correspond to distributed online protocols of interest for neuromorphic devices, and can be adapted to address the more biologically relevant continuous-time version of the classification problem, hopefully allowing the understanding of some aspects of synaptic plasticity.
Graphical models represent multivariate and generally not normalized probability distributions. Computing the normalization factor, called the partition function, is the main inference challenge relevant to multiple s...
详细信息
Graphical models represent multivariate and generally not normalized probability distributions. Computing the normalization factor, called the partition function, is the main inference challenge relevant to multiple statistical and optimization applications. The problem is #P-hard that is of an exponential complexity with respect to the number of variables. In this manuscript, aimed at approximating the partition function, we consider multi-graph models where binary variables and multivariable factors are associated with edges and nodes, respectively, of an undirected multi-graph. We suggest a new methodology for analysis and computations that combines the Gauge function technique from Chertkov and Chernyak (2006 Phys. Rev. E 73 065102;2006 J. Stat. Mech. 2006 P06009) with the technique developed in Anari and Oveis Gharan 2017 arXiv:;Gurvits 2011 arXiv:;Straszak and Vishnoi 2017 55th Annual Allerton Conf. on Communication, Control, and Computing, based on the recent progress in the field of real stable polynomials. We show that the Gauge function, representing a single-out term in a finite sum expression for the partition function which achieves extremum at the so-called belief-propagation gauge, has a natural polynomial representation in terms of gauges/variables associated with edges of the multi-graph. Moreover, Gauge function can be used to recover the partition function through a sequence of transformations allowing appealing algebraic and graphical interpretations. Algebraically, one step in the sequence consists of the application of a differential operator over gauges associated with an edge. Graphically, the sequence is interpreted as a repetitive elimination/contraction of edges resulting in multi-graph models on decreasing in size (number of edges) graphs with the same partition function as in the original multi-graph model. Even though the complexity of computing factors in the sequence of the derived multi-graph models and respective Gauge functions gro
We consider the problem of estimating the input and hidden variables of a stochastic multi-layer neural network (NN) from an observation of the output. The hidden variables in each layer are represented as matrices wi...
详细信息
We consider the problem of estimating the input and hidden variables of a stochastic multi-layer neural network (NN) from an observation of the output. The hidden variables in each layer are represented as matrices with statistical interactions along both rows as well as columns. This problem applies to matrix imputation, signal recovery via deep generative prior models, multi-task and mixed regression, and learning certain classes of two-layer NNs. We extend a recently-developed algorithm-multi-layer vector approximate messagepassing, for this matrix-valued inference problem. It is shown that the performance of the proposed multi-layer matrix vector approximate messagepassing algorithm can be exactly predicted in a certain random large-system limit, where the dimensions N x d of the unknown quantities grow as N -> infinity with d fixed. In the two-layer neural-network learning problem, this scaling corresponds to the case where the number of input features as well as training samples grow to infinity but the number of hidden nodes stays fixed. The analysis enables a precise prediction of the parameter and test error of the learning.
We analyse a linear regression problem with nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate message p...
详细信息
We analyse a linear regression problem with nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate messagepassing (AMP) algorithm considering nonconvex regularization, namely SCAD-AMP, and analytically show that the stability condition corresponds to the de Almeida-Thouless condition in spin glass literature. Through asymptotic analysis, we show the correspondence between the density evolution of SCAD-AMP and the replica symmetric (RS) solution. Numerical experiments confirm that for a su. ciently large system size, SCAD-AMP achieves the optimal performance predicted by the replica method. Through replica analysis, a phase transition between replica symmetric and replica symmetry breaking (RSB) region is found in the parameter space of SCAD. The appearance of the RS region for a nonconvex penalty is a significant advantage that indicates the region of smooth landscape of the optimization problem. Furthermore, we analytically show that the statistical representation performance of the SCAD penalty is better than that of l(1)-based methods, and the minimum representation error under RS assumption is obtained at the edge of the RS/RSB phase. The correspondence between the convergence of the existing coordinate descent algorithm and RS/RSB transition is also indicated.
We introduce a version of the cavity method for diluted mean field spin models that allows the computation of thermodynamic quantities similar to the Franz-Parisi quenched potential in sparse random graph models. This...
详细信息
We introduce a version of the cavity method for diluted mean field spin models that allows the computation of thermodynamic quantities similar to the Franz-Parisi quenched potential in sparse random graph models. This method is developed in the particular case of partially decimated random constraint satisfaction problems. This allows us to develop a theoretical understanding of a class of algorithms for solving constraint satisfaction problems, in which elementary degrees of freedom are sequentially assigned according to the results of a messagepassing procedure (belief propagation). We confront this theoretical analysis with the results of extensive numerical simulations.
暂无评论