The minimal dominating set (MDS) problem is a prototypical hard combinatorial optimization problem. We recently studied this problem using the cavity method. Although we obtained a solution for a given graph that give...
详细信息
The minimal dominating set (MDS) problem is a prototypical hard combinatorial optimization problem. We recently studied this problem using the cavity method. Although we obtained a solution for a given graph that gives a very good estimation of the minimal dominating size, we do not know whether there is a ground state solution or how many solutions exist in the ground state. We have therefore continued to develop a one-step replica symmetry breaking theory to investigate the ground state energy of the MDS problem. First, we find that the solution space for the MDS problem exhibits both condensation transition and cluster transition on regular random graphs, and prove this using a simulated annealing dynamical process. Second, we develop a zero-temperature survey propagation algorithm on Erdos-Renyi random graphs to estimate the ground state energy, and obtain a survey propagation decimation algorithm that achieves results as good as the belief propagation decimation algorithm.
The effective use of limited resources for controlling spreading processes on networks is of prime significance in diverse contexts, ranging from the identification of "influential spreaders" for maximizing ...
详细信息
The effective use of limited resources for controlling spreading processes on networks is of prime significance in diverse contexts, ranging from the identification of "influential spreaders" for maximizing information dissemination and targeted interventions in regulatory networks, to the development of mitigation policies for infectious diseases and financial contagion in economic systems. Solutions for these optimization tasks that are based purely on topological arguments are not fully satisfactory;in realistic settings, the problem is often characterized by heterogeneous interactions and requires interventions in a dynamic fashion over a finite time window via a restricted set of controllable nodes. The optimal distribution of available resources hence results from an interplay between network topology and spreading dynamics. We show how these problems can be addressed as particular instances of a universal analytical framework based on a scalable dynamic message-passing approach and demonstrate the efficacy of the method on a variety of real-world examples.
We analyse a linear regression problem with nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate message p...
详细信息
We analyse a linear regression problem with nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate messagepassing (AMP) algorithm considering nonconvex regularization, namely SCAD-AMP, and analytically show that the stability condition corresponds to the de Almeida-Thouless condition in spin glass literature. Through asymptotic analysis, we show the correspondence between the density evolution of SCAD-AMP and the replica symmetric (RS) solution. Numerical experiments confirm that for a su. ciently large system size, SCAD-AMP achieves the optimal performance predicted by the replica method. Through replica analysis, a phase transition between replica symmetric and replica symmetry breaking (RSB) region is found in the parameter space of SCAD. The appearance of the RS region for a nonconvex penalty is a significant advantage that indicates the region of smooth landscape of the optimization problem. Furthermore, we analytically show that the statistical representation performance of the SCAD penalty is better than that of l(1)-based methods, and the minimum representation error under RS assumption is obtained at the edge of the RS/RSB phase. The correspondence between the convergence of the existing coordinate descent algorithm and RS/RSB transition is also indicated.
We revisit classical bounds of Fisher on the ferromagnetic Ising model (Fisher 1967 Phys. Rev. 162 480), and show how to efficiently use them on an arbitrary given graph to rigorously upper-bound the partition functio...
详细信息
We revisit classical bounds of Fisher on the ferromagnetic Ising model (Fisher 1967 Phys. Rev. 162 480), and show how to efficiently use them on an arbitrary given graph to rigorously upper-bound the partition function, magnetizations, and correlations. The results are valid on any finite graph, with arbitrary topology and arbitrary positive couplings and fields. Our results are based on high temperature expansions of the aforementioned quantities, and are expressed in terms of two related linear operators: the non-backtracking operator and the Bethe Hessian. As a by-product, we show that in a well-defined high-temperature region, the susceptibility propagation algorithm (Mezard 2009 J. Physiol. 103 107-13) converges and provides an upper bound on the true spin-spin correlations.
This article is an extended version of previous work of Lesieur et al (2015 IEEE Int. Symp. on Information Theory Proc. pp 1635-9 and 2015 53rd Annual Allerton Conf. on Communication, Control and Computing (IEEE) pp 6...
详细信息
This article is an extended version of previous work of Lesieur et al (2015 IEEE Int. Symp. on Information Theory Proc. pp 1635-9 and 2015 53rd Annual Allerton Conf. on Communication, Control and Computing (IEEE) pp 680-7) on low-rank matrix estimation in the presence of constraints on the factors into which the matrix is factorized. Low-rank matrix factorization is one of the basic methods used in data analysis for unsupervised learning of relevant features and other types of dimensionality reduction. We present a framework to study the constrained low-rank matrix estimation for a general prior on the factors, and a general output channel through which the matrix is observed. We draw a parallel with the study of vector-spin glass models-presenting a unifying way to study a number of problems considered previously in separate statistical physics works. We present a number of applications for the problem in data analysis. We derive in detail a general form of the low-rank approximate messagepassing (Low-RAMP) algorithm, that is known in statistical physics as the TAP equations. We thus unify the derivation of the TAP equations for models as di. erent as the Sherrington-Kirkpatrick model, the restricted Boltzmann machine, the Hopfield model or vector (xy, Heisenberg and other) spin glasses. The state evolution of the Low-RAMP algorithm is also derived, and is equivalent to the replica symmetric solution for the large class of vector-spin glass models. In the section devoted to result we study in detail phase diagrams and phase transitions for the Bayesoptimal inference in low-rank matrix estimation. We present a typology of phase transitions and their relation to performance of algorithms such as the Low-RAMP or commonly used spectral methods.
Community structure is an important feature of networks, and the correct detection of communities is a fundamental problem in network analysis. Statistical inference has recently been proposed for successful detection...
详细信息
Community structure is an important feature of networks, and the correct detection of communities is a fundamental problem in network analysis. Statistical inference has recently been proposed for successful detection, provided the number of communities can be appropriately estimated a priori. In the absence of such information, model selection by determination of the number of communities remains an issue. We show here that correlation between communities from a highly parceled partition can be used to estimate a narrow range of variation for the real number of communities. This range, further elaborated by modularity-based belief propagation, correctly identifies communities. Testing on synthetic networks generated by a stochastic block model and a set of real-world networks shows that our method can alleviate the effects of modularity fluctuations well and enhance the ability of community detection of the bare modularity-based belief propagation method.
In an era of unprecedented deluge of (mostly unstructured) data, graphs are proving more and more useful, across the sciences, as a flexible abstraction to capture complex relationships between com- plex objects. One ...
详细信息
In an era of unprecedented deluge of (mostly unstructured) data, graphs are proving more and more useful, across the sciences, as a flexible abstraction to capture complex relationships between com- plex objects. One of the main challenges arising in the study of such networks is the inference of macroscopic, large-scale properties af- fecting a large number of objects, based solely on the microscopic in- teractions between their elementary constituents. Statistical physics, precisely created to recover the macroscopic laws of thermodynamics from an idealized model of interacting particles, provides significant insight to tackle such complex networks. In this dissertation, we use methods derived from the statistical physics of disordered systems to design and study new algorithms for inference on graphs. Our focus is on spectral methods, based on certain eigenvectors of carefully chosen matrices, and sparse graphs, containing only a small amount of information. We develop an origi- nal theory of spectral inference based on a relaxation of various mean- field free energy optimizations. Our approach is therefore fully prob- abilistic, and contrasts with more traditional motivations based on the optimization of a cost function. We illustrate the efficiency of our approach on various problems, including community detection, randomized similarity-based clustering, and matrix completion.
The optimization version of the cavity method for single instances, called Max-Sum, has been applied in the past to the minimum Steiner tree problem on graphs and variants. Max-Sum has been shown experimentally to giv...
详细信息
The optimization version of the cavity method for single instances, called Max-Sum, has been applied in the past to the minimum Steiner tree problem on graphs and variants. Max-Sum has been shown experimentally to give asymptotically optimal results on certain types of weighted random graphs, and to give good solutions in short computation times for some types of real networks. However, the hypotheses behind the formulation and the cavity method itself limit substantially the class of instances on which the approach gives good results (or even converges). Moreover, in the standard model formulation, the diameter of the tree solution is limited by a predefined bound, that a. ects both computation time and convergence properties. In this work we describe two main enhancements to the Max-Sum equations to be able to cope with optimization of real-world instances. First, we develop an alternative 'flat' model formulation that allows the relevant configuration space to be reduced substantially, making the approach feasible on instances with large solution diameter, in particular when the number of terminal nodes is small. Second, we propose an integration between Max-Sum and three greedy heuristics. This integration allows Max-Sum to be transformed into a highly competitive self-contained algorithm, in which a feasible solution is given at each step of the iterative procedure. Part of this development participated in the 2014 DIMACS Challenge on Steiner problems, and we report the results here. The performance on the challenge of the proposed approach was highly satisfactory: it maintained a small gap to the best bound in most cases, and obtained the best results on several instances in two different categories. We also present several improvements with respect to the version of the algorithm that participated in the competition, including new best solutions for some of the instances of the challenge.
In multiple scientific and technological applications we face the problem of having low dimensional data to be justified by a linear model defined in a high dimensional parameter space. The difference in dimensionalit...
详细信息
In multiple scientific and technological applications we face the problem of having low dimensional data to be justified by a linear model defined in a high dimensional parameter space. The difference in dimensionality makes the problem ill-defined: the model is consistent with the data for many values of its parameters. The objective is to find the probability distribution of parameter values consistent with the data, a problem that can be cast as the exploration of a high dimensional convex polytope. In this work we introduce a novel algorithm to solve this problem efficiently. It provides results that are statistically indistinguishable from currently used numerical techniques while its running time scales linearly with the system size. We show that the algorithm performs robustly in many abstract and practical applications. As working examples we simulate the effects of restricting reaction fluxes on the space of feasible phenotypes of a genome scale Escherichia coli metabolic network and infer the traffic flow between origin and destination nodes in a real communication network.
A framework based on generalized hierarchical random graphs (GHRGs) for the detection of change points in the structure of temporal networks has recently been developed by Peel and Clauset (2015 Proc. 29th AAAI Conf. ...
详细信息
A framework based on generalized hierarchical random graphs (GHRGs) for the detection of change points in the structure of temporal networks has recently been developed by Peel and Clauset (2015 Proc. 29th AAAI Conf. on Artificial Intelligence). We build on this methodology and extend it to also include the versatile stochastic block models (SBMs) as a parametric family for reconstructing the empirical networks. We use five different techniques for change point detection on prototypical temporal networks, including empirical and synthetic ones. We find that none of the considered methods can consistently outperform the others when it comes to detecting and locating the expected change points in empirical temporal networks. With respect to the precision and the recall of the results of the change points, we find that the method based on a degree-corrected SBM has better recall properties than other dedicated methods, especially for sparse networks and smaller sliding time window widths.
暂无评论