Random constraint satisfaction problems undergo several phase transitions as the ratio between the number of constraints and the number of variables is varied. When this ratio exceeds the satisfiability threshold no more solutions exist; the satisfiable phase, for less constrained problems, is itself divided into an unclustered regime and a clustered one. In the latter, solutions are grouped in clusters of nearby solutions separated in configuration space from solutions of other clusters. In addition, the rigidity transition signals the appearance of so-called frozen variables in typical solutions: beyond this threshold most solutions belong to clusters with an extensive number of variables taking the same values in all solutions of the cluster. In this paper we refine the description of this phenomenon by estimating the location of the freezing transition, corresponding to the disappearance of all unfrozen solutions (not only typical ones). We also unveil phase transitions for the existence and uniqueness of locked solutions, in which all variables are frozen. From a technical point of view we characterize atypical solutions with a number of frozen variables different from the typical value via a large deviation study of the dynamics of a stripping process (whitening) that unveils the frozen variables of a solution, building upon recent works on atypical trajectories of the bootstrap percolation dynamics. Our results also bear some relevance from an algorithmic perspective, previous numerical studies having shown that heuristic algorithms of various kinds usually output unfrozen solutions.
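As a rough illustration of the whitening process mentioned above, the following minimal sketch applies the commonly used whitening rule to a k-SAT formula. The choice of k-SAT, the DIMACS-style literal encoding, and the function name `whiten` are assumptions made for illustration only; the abstract does not say which constraint satisfaction problem or which exact rule the paper uses.

```python
def whiten(clauses, assignment):
    """Return the variables left frozen (never whitened) by whitening.

    clauses: list of clauses, each a list of nonzero ints
             (v means variable v is True, -v means it is False).
    assignment: dict variable -> bool, assumed to satisfy every clause.
    Illustrative rule (an assumption): a variable turns "white" once every
    clause containing it is either satisfied by another variable or
    already contains a white variable.
    """
    def satisfies(lit):
        return assignment[abs(lit)] == (lit > 0)

    white = set()
    changed = True
    while changed:  # iterate until no further variable can be whitened
        changed = False
        for var in assignment:
            if var in white:
                continue
            can_whiten = True
            for clause in clauses:
                if all(abs(lit) != var for lit in clause):
                    continue  # this clause does not involve var
                others = [lit for lit in clause if abs(lit) != var]
                has_white = any(abs(lit) in white for lit in others)
                has_support = any(satisfies(lit) for lit in others)
                if not (has_white or has_support):
                    can_whiten = False  # var alone keeps the clause satisfied
                    break
            if can_whiten:
                white.add(var)
                changed = True
    return set(assignment) - white


# Variable 1 is pinned by the unit clause; variable 2 is whitened.
frozen = whiten([[1], [1, 2]], {1: True, 2: False})
```

In this sketch a solution would count as unfrozen when the returned set is empty, and as locked when it contains every variable.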
A signal model called joint sparse model 2 (JSM-2) or the multiple measurement vector problem, in which all sparse signals share their support, is important for dealing with practical signal processing problems. In this paper, we investigate the typical reconstruction performance of noisy-measurement JSM-2 problems for l(2,1)-norm regularized least squares reconstruction and the Bayesian optimal reconstruction scheme in terms of mean square error. Employing the replica method, we show that these schemes, which exploit the knowledge of the sharing of the signal support, can recover the signals more precisely as the number of channels increases. In addition, we compare the reconstruction performance of two different ensembles of observation matrices: one is composed of independent and identically distributed random Gaussian entries and the other is designed so that row vectors are orthogonal to one another. As reported for the single-channel case in earlier studies, our analysis indicates that the latter ensemble offers better performance than the former for the noisy JSM-2 problem. The results of numerical experiments with a computationally feasible approximation algorithm we developed for this study agree with the theoretical estimation.
We consider a one-step replica symmetry breaking description of the Edwards-Anderson spin glass model in 2D. The ingredients of this description are a Kikuchi approximation to the free energy and a second-level statistical model built on the extremal points of the Kikuchi approximation, which are also fixed points of a generalized belief propagation (GBP) scheme. We show that a generalized free energy can be constructed where these extremal points are exponentially weighted by their Kikuchi free energy and a Parisi parameter y, and that the Kikuchi approximation of this generalized free energy leads to second-level, one-step replica symmetry breaking (1RSB) GBP equations. We then proceed analogously to the Bethe approximation case for tree-like graphs, where it has been shown that 1RSB belief propagation equations admit a survey propagation solution. We discuss when and how the 1RSB GBP equations that we obtain also allow a simpler class of solutions, which can be interpreted as a class of generalized survey propagation equations for the single instance graph case.
Two concepts of centrality have been defined in complex networks. The first considers the centrality of a node, and many different metrics for it have been defined (e.g. eigenvector centrality, PageRank, non-backtracking centrality, etc.). The second is related to the large-scale organization of the network, the core-periphery structure, composed of a dense core plus an outlying and loosely connected periphery. In this paper we investigate the relation between these two concepts. We consider networks generated via the stochastic block model, or its degree-corrected version, with a core-periphery structure, and we investigate the centrality properties of the core nodes and the ability of several centrality metrics to identify them. We find that the three measures with the best performance are marginals obtained with belief propagation, PageRank, and degree centrality, while non-backtracking and eigenvector centrality (or MINRES [10], shown to be equivalent to the latter in the large network limit) perform worse in the investigated networks.
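To make two of the centrality measures named above concrete, here is a minimal sketch of degree centrality and PageRank by power iteration on a toy core-periphery graph. The damping factor 0.85, the iteration count, and the six-node example graph are illustrative assumptions, not values or data taken from the paper.

```python
def degree_centrality(adj):
    """Degree of each node in an undirected graph given as adjacency lists."""
    return {u: len(nbrs) for u, nbrs in adj.items()}


def pagerank(adj, damping=0.85, iters=100):
    """PageRank by plain power iteration (damping value is an assumption)."""
    nodes = list(adj)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Rank held by nodes without neighbours is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not adj[u])
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            for v in adj[u]:
                new[v] += damping * rank[u] / len(adj[u])
        rank = new
    return rank


# Toy graph: nodes 0-2 form a dense core, nodes 3-5 hang off it.
toy = {0: [1, 2, 3], 1: [0, 2, 4], 2: [0, 1, 5], 3: [0], 4: [1], 5: [2]}
pr = pagerank(toy)
deg = degree_centrality(toy)
core, periphery = [0, 1, 2], [3, 4, 5]
```

On this toy graph both measures rank every core node above every periphery node; how well that separation holds on stochastic block model networks is the question the paper studies.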
Circular coloring is a constraint satisfaction problem where colors are assigned to nodes in a graph in such a way that every pair of connected nodes has two consecutive colors (the first color being consecutive to the last). We study circular coloring of random graphs using the cavity method. We identify two very interesting properties of this problem. For sufficiently many colors and sufficiently low temperature there is a spontaneous breaking of the circular symmetry between colors and a phase transition towards a ferromagnet-like phase. Our second main result concerns 5-circular coloring of random 3-regular graphs. While this case is found to be colorable, we conclude that the description via one-step replica symmetry breaking is not sufficient. We observe that simulated annealing is very efficient at finding proper colorings for this case. The 5-circular coloring of 3-regular random graphs thus provides a first known example of a problem where the ground state energy is known to be exactly zero yet the space of solutions probably requires a full-step replica symmetry breaking treatment.
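The constraint defined above can be checked in a few lines. The sketch below assumes colors are encoded as integers 0..q-1 and counts an edge as violated when its endpoint colors are not consecutive modulo q; the function names and the encoding are illustrative choices, not taken from the paper.

```python
def coloring_energy(edges, colors, q):
    """Number of edges whose endpoint colors are not consecutive modulo q."""
    return sum((colors[u] - colors[v]) % q not in (1, q - 1) for u, v in edges)


def is_circular_coloring(edges, colors, q):
    """True when every edge joins two consecutive colors (zero energy)."""
    return coloring_energy(edges, colors, q) == 0


# A 5-cycle colored 0,1,2,3,4 is a proper 5-circular coloring:
# the closing edge (4, 0) is fine because color 0 follows color 4.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
ok = is_circular_coloring(cycle, {i: i for i in range(5)}, q=5)

# A triangle colored 0,1,2 with q=5 violates the edge (2, 0).
triangle = [(0, 1), (1, 2), (2, 0)]
bad = coloring_energy(triangle, {0: 0, 1: 1, 2: 2}, q=5)
```

A coloring with energy zero corresponds to the zero ground state energy mentioned in the abstract.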
Author: Zhou, Hai-Jun (Chinese Acad Sci, Key Lab Theoret Phys, Inst Theoret Phys, Zhong Guan Cun East Rd 55, Beijing 100190, Peoples R China)
A directed graph (digraph) is formed by vertices and arcs (directed edges) from one vertex to another. A feedback vertex set (FVS) is a set of vertices that contains at least one vertex of every directed cycle in this digraph. The directed feedback vertex set problem aims at constructing an FVS of minimum cardinality. This is a fundamental cycle-constrained hard combinatorial optimization problem with wide practical applications. In this paper we construct a spin glass model for the directed FVS problem by converting the global cycle constraints into local arc constraints, and study this model through the replica-symmetric (RS) mean field theory of statistical physics. We then implement a belief propagation-guided decimation (BPD) algorithm for single digraph instances. The BPD algorithm slightly outperforms the simulated annealing algorithm on large random graph instances. The RS mean field results and algorithmic results can be further improved by working on a more restrictive (and more difficult) spin glass model.
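The definition above implies that a vertex set is an FVS exactly when deleting it leaves a digraph with no directed cycle. The sketch below checks this with a standard topological-sort test (Kahn's algorithm); it only verifies a candidate set and is not the paper's BPD algorithm. The function name and the input encoding are illustrative assumptions.

```python
from collections import deque


def is_feedback_vertex_set(vertices, arcs, fvs):
    """True if removing `fvs` leaves the digraph without directed cycles."""
    remaining = set(vertices) - set(fvs)
    out = {v: [] for v in remaining}
    indeg = {v: 0 for v in remaining}
    for u, v in arcs:
        # Keep only arcs whose two endpoints both survive the removal.
        if u in remaining and v in remaining:
            out[u].append(v)
            indeg[v] += 1
    # Kahn's algorithm: repeatedly peel off vertices with no incoming arc.
    queue = deque(v for v in remaining if indeg[v] == 0)
    peeled = 0
    while queue:
        u = queue.popleft()
        peeled += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # Every vertex gets peeled exactly when no directed cycle is left.
    return peeled == len(remaining)


# Two directed cycles sharing vertex 2: 0 -> 1 -> 2 -> 0 and 2 -> 3 -> 2.
V = [0, 1, 2, 3]
A = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 2)]
```

Here `{2}` is a minimum FVS because vertex 2 lies on both cycles, whereas `{0}` leaves the cycle between 2 and 3 intact.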
Approximate message passing (AMP) has been shown to be an excellent statistical approach to signal inference and compressed sensing problems. The AMP framework provides modularity in the choice of signal prior; here we propose a hierarchical form of the Gauss-Bernoulli prior which utilizes a restricted Boltzmann machine (RBM) trained on the signal support to push reconstruction performance beyond that of simple i.i.d. priors for signals whose support can be well represented by a trained binary RBM. We present and analyze two methods of RBM factorization and demonstrate how these affect signal reconstruction performance within our proposed algorithm. Finally, using the MNIST handwritten digit dataset, we show experimentally that using an RBM allows AMP to approach oracle-support performance.
We investigate leave-one-out cross validation (CV) as a means of determining the weight of the penalty term in the least absolute shrinkage and selection operator (LASSO). First, on the basis of the message passing algorithm and a perturbative discussion assuming that the number of observations is sufficiently large, we provide simple formulas for approximately assessing two types of CV errors, which enable us to significantly reduce the necessary cost of computation. These formulas also provide a simple connection of the CV errors to the residual sums of squares between the reconstructed and the given measurements. Second, on the basis of this finding, we analytically evaluate the CV errors when the design matrix is given as a simple random matrix in the large size limit by using the replica method. Finally, these results are compared with those of numerical simulations on finite-size systems and are confirmed to be correct. We also apply the simple formulas of the first type of CV error to an actual dataset of supernovae.
This paper explores combinatorial optimization for problems of max-weight graph matching on multi-partite graphs, which arise in integrating multiple data sources. In the most common two-source case, it is often desirable for the final matching to be one-to-one; the database and statistical record linkage communities accomplish this by weighted bipartite graph matching on similarity scores. Such matchings are intuitively appealing: they leverage a natural global property of many real-world entity stores, that of being nearly deduped, and are known to provide significant improvements to precision and recall. Unfortunately, unlike the bipartite case, exact max-weight matching on multi-partite graphs is known to be NP-hard. Our two-fold algorithmic contributions approximate multi-partite max-weight matching: our first algorithm borrows optimization techniques common to Bayesian probabilistic inference; our second is a greedy approximation algorithm. In addition to a theoretical guarantee on the latter, we present comparisons on a real-world entity resolution problem from Bing significantly larger than typically found in the literature, on publication data, and on a series of synthetic problems. Our results quantify significant improvements due to exploiting multiple sources, which are made possible by global one-to-one constraints linking otherwise independent matching sub-problems. We also discover that our algorithms are complementary: one being much more robust under noise, and the other being simple to implement and very fast to run.
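For intuition about one-to-one matching on similarity scores, the sketch below shows a generic greedy heuristic for the two-source (bipartite) case: take candidate pairs in decreasing score order and keep a pair only if neither record is already matched. This is a textbook-style illustration under assumed inputs, not the paper's greedy algorithm, whose details the abstract does not give; the record names and scores are hypothetical.

```python
def greedy_matching(scored_pairs):
    """Greedy one-to-one matching from (left, right, score) triples."""
    used_left, used_right = set(), set()
    matching = []
    # Visit candidate pairs from highest to lowest similarity score.
    for left, right, score in sorted(scored_pairs, key=lambda t: t[2], reverse=True):
        if left not in used_left and right not in used_right:
            matching.append((left, right, score))
            used_left.add(left)
            used_right.add(right)
    return matching


# Hypothetical similarity scores between records of two sources.
pairs = [("a1", "b1", 0.9), ("a1", "b2", 0.8), ("a2", "b1", 0.7), ("a2", "b2", 0.1)]
result = greedy_matching(pairs)
total = sum(score for _, _, score in result)
```

On this input the greedy pass keeps (a1, b1) and (a2, b2) for a total score of 1.0, while the alternative one-to-one matching (a1, b2), (a2, b1) scores 1.5, which shows why greedy matching is an approximation rather than an exact method.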
Message-passing algorithms based on belief propagation (BP) are successfully used in many applications, including decoding error correcting codes and solving constraint satisfaction and inference problems. The BP-based algorithms operate over graph representations, called factor graphs, that are used to model the input. Although in many cases the BP-based algorithms exhibit impressive empirical results, not much has been proved when the factor graphs have cycles. This paper deals with packing and covering integer programs in which the constraint matrix is zero-one, the constraint vector is integral, and the variables are subject to box constraints. We study the performance of the min-sum algorithm when applied to the corresponding factor graph models of packing and covering linear programs (LPs). We compare the solutions computed by the min-sum algorithm for packing and covering problems to the optimal solutions of the corresponding LP relaxations. In particular, we prove that if the LP has an optimal fractional solution, then for each fractional component, the min-sum algorithm either computes multiple solutions or the solution oscillates below and above the fraction. This implies that the min-sum algorithm computes the optimal integral solution only if the LP has a unique optimal solution that is integral. The converse is not true in general. For a special case of packing and covering problems, we prove that if the LP has a unique optimal solution that is integral and on the boundary of the box constraints, then the min-sum algorithm computes the optimal solution in pseudopolynomial time. Our results unify and extend recent results for the maximum weight matching problem and for the maximum weight independent set problem.