One of the challenges practitioners face when applying structure learning algorithms to their data involves determining a set of hyperparameters;otherwise, a set of hyperparameter defaults is assumed. The optimal hype...
详细信息
One of the challenges practitioners face when applying structure learning algorithms to their data involves determining a set of hyperparameters;otherwise, a set of hyperparameter defaults is assumed. The optimal hyperparameter configuration often depends on multiple factors, including the size and density of the usually unknown underlying true graph, the sample size of the input data, and the structure learning algorithm. We propose a novel hyperparameter tuning method, called the Out-of-sample Tuning for Structure Learning (OTSL), that employs out-of-sample and resampling strategies to estimate the optimal hyperparameter configuration for structure learning, given the input dataset and structure learning algorithm. Synthetic experiments show that employing OTSL to tune the hyperparameters of hybrid and score-based structure learning algorithms leads to improvements in graphical accuracy compared to the state-of-the-art. We also illustrate the applicability of this approach to real datasets from different disciplines.
This paper presents an innovative model of a program's internal behavior over a set of test inputs, called the probabilistic program dependence graph (PPDG), which facilitates probabilistic analysis and reasoning ...
详细信息
This paper presents an innovative model of a program's internal behavior over a set of test inputs, called the probabilistic program dependence graph (PPDG), which facilitates probabilistic analysis and reasoning about uncertain program behavior, particularly that associated with faults. The PPDG construction augments the structural dependences represented by a program dependence graph with estimates of statistical dependences between node states, which are computed from the test set. The PPDG is based on the established framework of probabilistic graphical models, which are used widely in a variety of applications. This paper presents algorithms for constructing PPDGs and applying them to fault diagnosis. The paper also presents preliminary evidence indicating that a PPDG-based fault localization technique compares favorably with existing techniques. The paper also presents evidence indicating that PPDGs can be useful for fault comprehension.
Automatically learning the graph structure of a single Bayesian network (BN) which accurately represents the underlying multivariate probability distribution of a collection of random variables is a challenging task. ...
详细信息
Automatically learning the graph structure of a single Bayesian network (BN) which accurately represents the underlying multivariate probability distribution of a collection of random variables is a challenging task. But obtaining a Bayesian solution to this problem based on computing the posterior probability of the presence of any edge or any directed path between two variables or any other structural feature is a much more involved problem, since it requires averaging over all the possible graph structures. For the former problem, recent advances have shown that search + score approaches find much more accurate structures if the search is constrained by a previously inferred skeleton (i.e. a relaxed structure with undirected edges which can be inferred using local search based methods). Based on similar ideas, we propose two novel skeleton-based approaches to approximate a Bayesian solution to the BN learning problem: a new stochastic search which tries to find directed acyclic graph (DAG) structures with a non-negligible score;and a new Markov chain Monte Carlo method over the DAG space. These two approaches are based on the same idea. In a first step, both employ a previously given skeleton and build a Bayesian solution constrained by this skeleton. In a second step, using the preliminary solution, they try to obtain a new Bayesian approximation but this time in an unconstrained graph space, which is the final outcome of the methods. As shown in the experimental evaluation, this new approach strongly boosts the performance of these two standard techniques proving that the idea of employing a skeleton to constrain the model space is also a successful strategy for performing Bayesian structure learning of BNs. (C) 2012 Elsevier B.V. All rights reserved.
In this paper, we propose a probabilisticgraphical model to represent weakly annotated images. We consider an image as weakly annotated if the number of keywords defined for it is less than the maximum number defined...
详细信息
In this paper, we propose a probabilisticgraphical model to represent weakly annotated images. We consider an image as weakly annotated if the number of keywords defined for it is less than the maximum number defined in the ground truth. This model is used to classify images and automatically extend existing annotations to new images by taking into account semantic relations between keywords. The proposed method has been evaluated in visual-textual classification and automatic annotation of images. The visual-textual classification is performed by using both visual and textual information. The experimental results, obtained from a database of more than 30,000 images, show an improvement by 50.5% in terms of recognition rate against only visual information classification. Taking into account semantic relations between keywords improves the recognition rate by 10.5%. Moreover. the proposed model can be used to extend existing annotations to weakly annotated images, by computing distributions of missing keywords. Semantic relations improve the mean rate of good annotations by 6.9%. Finally, the proposed method is competitive with a state-of-art model. (C) 2010 Elsevier Inc. All rights reserved.
We introduce the structural interface algorithm for exact probabilistic inference in Dynamic Bayesian Networks. It unifies state-of-the-art techniques for inference in static and dynamic networks, by combining princip...
详细信息
We introduce the structural interface algorithm for exact probabilistic inference in Dynamic Bayesian Networks. It unifies state-of-the-art techniques for inference in static and dynamic networks, by combining principles of knowledge compilation with the interface algorithm. The resulting algorithm not only exploits the repeated structure in the network, but also the local structure, including determinism, parameter equality and context-specific independence. Empirically, we show that the structural interface algorithm speeds up inference in the presence of local structure, and scales to larger and more complex networks. (C) 2015 Elsevier B.V. All rights reserved.
Reliability analysis has become an integral part of system design and operation. This is especially true for systems performing critical tasks, such as mass transportation systems. This explains the numerous advances ...
详细信息
Reliability analysis has become an integral part of system design and operation. This is especially true for systems performing critical tasks, such as mass transportation systems. This explains the numerous advances in the field of reliability modeling. More recently, some studies involving the use of Bayesian networks have been proven relevant to represent complex systems and perform reliability studies. In previous works, a generic methodology was introduced for developing a decision support tool to evaluate complex systems maintenance strategies. This article deals with development of such a decision tool dedicated to the maintenance of Paris metro rails. Indeed, owing to fulfillment of high-performance levels of safety and availability (the latter being especially critical at peak hours), operators need to estimate, hour by hour their ability to prevent or to detect broken rails. To address this problem, a decision support tool was developed, the aim of this article is to evaluate, compare and optimize various operating and maintenance strategies.
Recently, multiview learning has been increasingly focused on machine learning. However, most existing multiview learning methods cannot directly deal with multiview sequential data, in which the inherent dynamical st...
详细信息
Recently, multiview learning has been increasingly focused on machine learning. However, most existing multiview learning methods cannot directly deal with multiview sequential data, in which the inherent dynamical structure is often ignored. Especially, most traditional multiview machine learning methods assume that the items at different time slices within a sequence are independent of each other. In order to solve this problem, we propose a new multiview discriminant model based on conditional random fields (CRFs) to model multiview sequential data, called multiview CRF. It inherits the advantages of CRFs that build a relationship between items in each sequence. Moreover, by introducing specific features designed on the CRFs for multiview data, the multiview CRF not only considers the relationship among different views but also captures the correlation between the features from the same view. Particularly, some features can be reused or divided into different views to build an appropriate size of feature space. This helps to avoid underfitting problems caused by too small feature space or overfitting problems caused by too large feature space. In order to handle large-scale data, we use the stochastic gradient method to speed up our model. The experimental results on the text and video data illustrate the superiority of the proposed model.
Attack graphs are a powerful tool for security risk assessment by analysing network vulnerabilities and the paths attackers can use to compromise network resources. The uncertainty about the attacker's behaviour m...
详细信息
Attack graphs are a powerful tool for security risk assessment by analysing network vulnerabilities and the paths attackers can use to compromise network resources. The uncertainty about the attacker's behaviour makes Bayesian networks suitable to model attack graphs to perform static and dynamic analysis. Previous approaches have focused on the formalization of attack graphs into a Bayesian model rather than proposing mechanisms for their analysis. In this paper we propose to use efficient algorithms to make exact inference in Bayesian attack graphs, enabling the static and dynamic network risk assessments. To support the validity of our approach we have performed an extensive experimental evaluation on synthetic Bayesian attack graphs with different topologies, showing the computational advantages in terms of time and memory use of the proposed techniques when compared to existing approaches.
Algorithms for inferring the structure of Bayesian networks from data have become an increasingly popular method for uncovering the direct and indirect influences among variables in complex systems. A Bayesian approac...
详细信息
Algorithms for inferring the structure of Bayesian networks from data have become an increasingly popular method for uncovering the direct and indirect influences among variables in complex systems. A Bayesian approach to structure learning uses posterior probabilities to quantify the strength with which the data and prior knowledge jointly support each possible graph feature. Existing Markov Chain Monte Carlo (MCMC) algorithms for estimating these posterior probabilities are slow in mixing and convergence, especially for large networks. We present a novel Markov blanket resampling (MBR) scheme that intermittently reconstructs the Markov blanket of nodes, thus allowing the sampler to more effectively traverse low-probability regions between local maxima. As we can derive the complementary forward and backward directions of the MBR proposal distribution, the Metropolis-Hastings algorithm can be used to account for any asymmetries in these proposals. Experiments across a range of network sizes show that the MBR scheme outperforms other state-of-the-art algorithms, both in terms of learning performance and convergence rate. In particular, MBR achieves better learning performance than the other algorithms when the number of observations is relatively small and faster convergence when the number of variables in the network is large.
In this paper, we present Prometheus, a graph partitioning based algorithm that creates multiple variable decompositions efficiently for learning Sum-Product Network structures across both continuous and discrete doma...
详细信息
In this paper, we present Prometheus, a graph partitioning based algorithm that creates multiple variable decompositions efficiently for learning Sum-Product Network structures across both continuous and discrete domains. Prometheus proceeds by creating multiple candidate decompositions that are represented compactly with an acyclic directed graph in which common parts of different decompositions are shared. It eliminates the correlation threshold hyperparameter often used in other structure learning techniques, allowing Prometheus to learn structures that are robust in low data regimes. Prometheus outperforms other structure learning techniques in 30 discrete and continuous domains. We also extend Prometheus to exploit sparsity in correlations between features in order to obtain an efficient sub-quadratic algorithm (w.r.t. the number of features) that scales better to high dimensional datasets. (C) 2020 The Authors. Published by Elsevier Inc.
暂无评论