We show that streams and lazy data structures are a natural idiom for programming with infinite-dimensional Bayesian methods such as Poisson processes, Gaussian processes, jump processes, Dirichlet processes, and Beta processes. The crucial semantic idea, inspired by developments in synthetic probability theory, is to work with two separate monads: an affine monad of probability, which supports laziness, and a commutative, non-affine monad of measures, which does not. (Affine means that $T(1) \cong 1$.) We show that the separation is important from a decidability perspective, and that the recent model of quasi-Borel spaces supports these two monads. To perform Bayesian inference with these examples, we introduce new inference methods that are specially adapted to laziness; they are proven correct by reference to the Metropolis-Hastings-Green method. Our theoretical development is implemented as a Haskell library, LazyPPL.
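As a rough illustration of the "streams as infinite-dimensional objects" idea, the sketch below uses a Python generator to represent the infinitely many points of a homogeneous Poisson process and only forces as many points as a query needs. This is not the LazyPPL API and it does not model the two-monad semantics; the names `poisson_points` and `points_up_to` are hypothetical and chosen for this example only.

```python
import itertools
import random


def poisson_points(rate, rng):
    """Lazily yield the points of a homogeneous Poisson process on [0, inf).

    The stream is conceptually infinite; points are only computed on demand.
    """
    t = 0.0
    while True:
        t += rng.expovariate(rate)  # gaps between points are i.i.d. Exponential(rate)
        yield t


def points_up_to(horizon, rate, seed=0):
    """Force only the finite prefix of the stream that lies in [0, horizon]."""
    rng = random.Random(seed)
    stream = poisson_points(rate, rng)
    return list(itertools.takewhile(lambda t: t <= horizon, stream))


if __name__ == "__main__":
    pts = points_up_to(horizon=5.0, rate=2.0, seed=42)
    print(len(pts), "points in [0, 5]")
```

The generator plays the role of a lazy list: the caller decides how much of the process to inspect, and the rest is never sampled.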
The ability to connect genetic information between traits over time allows Bayesian networks to offer a powerful probabilistic framework to construct genomic prediction models. In this study, we phenotyped a diversity panel of 869 biomass sorghum (Sorghum bicolor (L.) Moench) lines, which had been genotyped with 100,435 SNP markers, for plant height (PH) with biweekly measurements from 30 to 120 days after planting (DAP) and for end-of-season dry biomass yield (DBY) in four environments. We evaluated five genomic prediction models: Bayesian network (BN), Pleiotropic Bayesian network (PBN), Dynamic Bayesian network (DBN), multi-trait GBLUP (MTr-GBLUP), and multi-time GBLUP (MTi-GBLUP) models. In fivefold cross-validation, prediction accuracies ranged from 0.46 (PBN) to 0.49 (MTr-GBLUP) for DBY and from 0.47 (DBN, DAP120) to 0.75 (MTi-GBLUP, DAP60) for PH. Forward-chaining cross-validation further improved prediction accuracies of the DBN, MTi-GBLUP and MTr-GBLUP models for PH (training slice: 30-45 DAP) by 36.4-52.4% relative to the BN and PBN models. Coincidence indices (target: biomass, secondary: PH) and a coincidence index based on lines (PH time series) showed that the ranking of lines by PH changed minimally after 45 DAP. These results suggest a two-level indirect selection method for PH at harvest (first-level target trait) and DBY (second-level target trait) could be conducted earlier in the season based on ranking of lines by PH at 45 DAP (secondary trait). With the advance of high-throughput phenotyping technologies, our proposed two-level indirect selection framework could be valuable for enhancing genetic gain per unit of time when selecting on developmental traits.
The present stage of development in stochastic programming already provides a good basis for real-life applications. The possibility of using alternative models is studied on a small-size but meaningful example connected with water management of a real-life water resource system in Eastern Czechoslovakia. Both of the considered conceptually different stochastic programming models take into account intercorrelations within a group of random parameters and provide comparable optimal decisions. At the same time, these models are used for comparison of existing numerical procedures for stochastic programming, namely, approximation schemes that result in large-size linear programs, stochastic quasigradient methods and special techniques for handling joint chance constraints.
The Chance Constrained Critical Path (CCCP) generally depends on the preassigned minimum probability level. It is shown that for a wide class of probability distributions there always exists a probability value for which the CCCP remains unchanged for all probabilities greater than or equal to that value. This probability value is easily obtained from an optimal solution of a simple network problem. In addition, necessary and sufficient conditions for the CCCP to be unconditionally independent of the minimum probability level are given for that class of probability distributions.
In this paper, we address the following probabilistic version (PSC) of the set covering problem: $\min\{cx \mid P(Ax \ge \xi) \ge p,\ x \in \{0, 1\}^N\}$, where $A$ is a 0-1 matrix, $\xi$ is a random 0-1 vector, and $p \in (0, 1]$ is the threshold probability level. We introduce the concepts of p-inefficiency and polarity cuts. While the former is aimed at deriving an equivalent MIP reformulation of (PSC), the latter is used as a strengthening device to obtain a stronger formulation. Simplifications of the MIP model which result when one of the following conditions holds are briefly discussed: $A$ is a balanced matrix, $A$ has the circular ones property, the components of $\xi$ are pairwise independent, the distribution function of $\xi$ is a stationary distribution or has the disjunctive shattering property. We corroborate our theoretical findings by an extensive computational experiment on a test-bed consisting of almost 10,000 probabilistic instances. This test-bed was created using deterministic instances from the literature and consists of probabilistic variants of the set covering model and capacitated versions of facility location, warehouse location and k-median models. Our computational results show that our procedure is orders of magnitude faster than any of the existing approaches to solve (PSC), and in many cases can reduce hours of computing time to a fraction of a second.
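To make the (PSC) formulation concrete, here is a minimal brute-force sketch for tiny instances. It is not the paper's method (no p-inefficiency or polarity cuts); it simply enumerates all 0-1 vectors. It also assumes, purely for illustration, that the components of the random vector are mutually independent Bernoulli variables with success probabilities `q`, so the chance constraint reduces to a product over uncovered rows. All function names are hypothetical.

```python
from itertools import product
from math import prod


def coverage_probability(A, x, q):
    """P(Ax >= xi) when xi_i ~ Bernoulli(q[i]) independently (an assumption).

    A row with (Ax)_i >= 1 is satisfied whatever xi_i is; a row with
    (Ax)_i == 0 is satisfied only if xi_i == 0, which has probability 1 - q[i].
    """
    uncovered = [
        i for i, row in enumerate(A)
        if sum(a * xj for a, xj in zip(row, x)) == 0
    ]
    return prod(1.0 - q[i] for i in uncovered)


def solve_psc_bruteforce(A, c, q, p):
    """Return (cost, x) minimising cx subject to P(Ax >= xi) >= p, or None."""
    best = None
    for x in product((0, 1), repeat=len(c)):
        if coverage_probability(A, x, q) >= p:
            cost = sum(ci * xi for ci, xi in zip(c, x))
            if best is None or cost < best[0]:
                best = (cost, x)
    return best


if __name__ == "__main__":
    A = [[1, 0], [0, 1], [1, 1]]   # rows = items to cover, columns = sets
    c = [1, 1]                     # cost of each set
    q = [0.9, 0.05, 0.9]           # P(xi_i = 1) for each row
    print(solve_psc_bruteforce(A, c, q, p=0.9))
```

Enumeration is exponential in the number of columns, which is exactly why the abstract's MIP reformulation matters for realistic instances.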
Ignorance, inconsistency, nonsense and similar phenomena are omnipresent in everyday reasoning. They have been intensively studied, especially in the area of multiple-valued logics. Therefore we develop a framework for belief bases, combining multiple-valued and probabilistic reasoning, with the main focus on the way belief bases are actually used and accessed through queries. As an implementation tool we use the probabilistic programming language PROBLOG. Though PROBLOG is based on distribution semantics with the independence assumption, we show how its constructs can successfully be used in implementing the considered logics and belief bases. In particular, we develop a technique for shifting probabilistic dependencies to the level of symbolic parts of belief bases. We also discuss applications of the framework in reasoning with Likert-type scales, widely exploited in questionnaire-based experimental research in psychology, economics, sociology, politics, public opinion measurements, and related areas.
We have formulated the ab-initio prediction of the 3D-structure of proteins as a probabilistic programming problem where the inter-residue 3D-distances are treated as random variables. Lower and upper bounds for these random variables and the corresponding probabilities are estimated by nonparametric statistical methods and knowledge-based heuristics. In this paper we focus on the probabilistic computation of the 3D-structure using these distance interval estimates. Validation of the predicted structures shows our method to be more accurate than other computational methods reported so far. Our method is also found to be computationally more efficient than other existing ab-initio structure prediction methods. Moreover, we provide a reliability index for the predicted structures. Because of its computational simplicity and its applicability to any random sequence, our algorithm, called PROPAINOR (PROtein structure Prediction by AI and Nonparametric Regression), has significant scope in computational protein structural genomics. (C) 2002 Published by Elsevier Science Ltd.
This paper draws together four perspectives that contribute to a new understanding of probability and solving problems involving probability. The first is the Subjective Bayesian perspective that probability is affected by one's knowledge, and that it is updated as one's knowledge changes. The main criticism of the Bayesian perspective is the problem of assigning prior probabilities; this problem disappears with our Information Theory perspective, in which we take the bold new step of equating probability with information. The main point of the paper is that the formal perspective (formalize, calculate, unformalize) is beneficial to solving probability problems. And finally, the programmer's perspective provides us with a suitable formalism. To illustrate the benefits of these perspectives, we completely solve the hitherto open problem of the two envelopes.
We present a static analysis for discovering differentiable or more generally smooth parts of a given probabilistic program, and show how the analysis can be used to improve the pathwise gradient estimator, one of the most popular methods for posterior inference and model learning. Our improvement increases the scope of the estimator from differentiable models to non-differentiable ones without requiring manual intervention of the user; the improved estimator automatically identifies differentiable parts of a given probabilistic program using our static analysis, and applies the pathwise gradient estimator to the identified parts while using a more general but less efficient estimator, called the score estimator, for the rest of the program. Our analysis has a surprisingly subtle soundness argument, partly due to the misbehaviours of some target smoothness properties when viewed from the perspective of program analysis designers. For instance, some smoothness properties, such as partial differentiability and partial continuity, are not preserved by function composition, and this makes it difficult to analyse sequential composition soundly without heavily sacrificing precision. We formulate five assumptions on a target smoothness property, prove the soundness of our analysis under those assumptions, and show that our leading examples satisfy these assumptions. We also show that by using information from our analysis instantiated for differentiability, our improved gradient estimator satisfies an important differentiability requirement and thus computes the correct estimate on average (i.e., returns an unbiased estimate) under a regularity condition. Our experiments with representative probabilistic programs in the Pyro language show that our static analysis is capable of identifying smooth parts of those programs accurately, and making our improved pathwise gradient estimator exploit all the opportunities for high performance in those programs.
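The two estimators contrasted in this abstract can be compared on a toy problem. The sketch below estimates the gradient of E[z^2] with respect to theta, where z ~ Normal(theta, 1), whose true value is 2*theta. It is plain Python rather than Pyro, it does not implement the paper's static analysis, and the function names are hypothetical. The pathwise estimator differentiates through the reparameterisation z = theta + eps, while the score estimator multiplies the integrand by the derivative of the log-density.

```python
import random


def pathwise_gradient(theta, n, rng):
    """Pathwise (reparameterisation) estimate of d/dtheta E[z^2], z = theta + eps."""
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = theta + eps
        total += 2.0 * z          # d(z^2)/dtheta with dz/dtheta = 1
    return total / n


def score_gradient(theta, n, rng):
    """Score-function estimate: E[f(z) * d/dtheta log N(z; theta, 1)]."""
    total = 0.0
    for _ in range(n):
        z = rng.gauss(theta, 1.0)
        total += (z ** 2) * (z - theta)   # d/dtheta log-density = z - theta
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    theta = 1.5
    print("true gradient :", 2.0 * theta)
    print("pathwise      :", pathwise_gradient(theta, 100_000, rng))
    print("score         :", score_gradient(theta, 100_000, rng))
```

Both estimates are unbiased here, but the score estimate typically has noticeably higher variance, which is the efficiency gap the abstract refers to. The score estimator still applies when the integrand is not differentiable in z, where the pathwise estimator does not.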
We describe a method to perform functional operations on probability distributions of random variables. The method uses reproducing kernel Hilbert space representations of probability distributions, and it is applicable to all operations which can be applied to points drawn from the respective distributions. We refer to our approach as kernel probabilistic programming. We illustrate it on synthetic data and show how it can be used for nonparametric structural equation models, with an application to causal inference.
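A minimal sketch of the underlying idea, under simplifying assumptions: each distribution is represented by a list of equally weighted sample points, an operation on random variables is carried out by applying it to those points, and the result is handled through its empirical kernel mean embedding with a Gaussian kernel. Pairing every point of one sample with every point of the other assumes the two variables are independent. The function names, the uniform weights, and the choice of kernel are assumptions made for this example, not details taken from the paper.

```python
import math
from itertools import product


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel on the real line."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def apply_operation(f, xs, ys):
    """Represent f(X, Y) for independent X, Y by f applied to all point pairs."""
    return [f(x, y) for x, y in product(xs, ys)]


def mean_embedding(points, t, bandwidth=1.0):
    """Evaluate the empirical kernel mean embedding of `points` at location t."""
    return sum(gaussian_kernel(p, t, bandwidth) for p in points) / len(points)


def mmd_squared(ps, qs, bandwidth=1.0):
    """Biased (V-statistic) squared maximum mean discrepancy between two samples."""
    def avg(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return avg(ps, ps) - 2.0 * avg(ps, qs) + avg(qs, qs)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]
    ys = [10.0, 20.0]
    zs = apply_operation(lambda x, y: x + y, xs, ys)   # sample representation of X + Y
    print(zs)
    print(mean_embedding(zs, 11.0))
    print(mmd_squared(zs, xs))
```

The squared MMD gives a way to compare the transformed distribution with another sample in the reproducing kernel Hilbert space without estimating densities.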