ISBN: 9783030449148 (digital); 9783030449148, 9783030449131 (print)
Probabilistic programming offers a concise way to represent stochastic models and perform automated statistical inference. However, many real-world models have discrete or hybrid discrete-continuous distributions, for which existing tools may suffer non-trivial limitations. Inference and parameter estimation can be exceedingly slow for these models because many inference algorithms compute results faster (or exclusively) when the distributions being inferred are continuous. To address this discrepancy, this paper presents Leios. Leios is the first approach for systematically approximating arbitrary probabilistic programs that have discrete or hybrid discrete-continuous random variables. The approximate programs have all their variables fully continualized. We show that once we have the fully continuous approximate program, we can perform inference and parameter estimation faster by exploiting the existing support that many languages offer for continuous distributions. Furthermore, we show that the estimates obtained when performing inference and parameter estimation on the continuous approximation remain close both to the true parameter values and to the estimates obtained when performing inference on the original model.
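As a rough illustration of the continualization idea described above (not the Leios transformation itself), the sketch below replaces a Bernoulli choice with the standard Concrete/Gumbel-Softmax relaxation, so the discrete variable becomes a smooth variable in (0, 1). The model, noise scale, and temperature are made up for the example.

```python
# Illustrative sketch only: one standard way to "continualize" a discrete
# random choice is the Concrete / Gumbel-Softmax relaxation. Leios uses its
# own systematic transformation; this only shows the flavour of the idea.
import numpy as np

rng = np.random.default_rng(0)

def discrete_model(p, n):
    """Original model: x ~ Bernoulli(p), y ~ Normal(x, 0.5)."""
    x = rng.random(n) < p
    return x + 0.5 * rng.standard_normal(n)

def continualized_model(p, n, temperature=0.3):
    """Relaxed model: x_tilde is a smooth variable in (0, 1)."""
    logits = np.log(p) - np.log(1 - p)
    logistic_noise = rng.logistic(size=n)  # difference of two Gumbels
    x_tilde = 1.0 / (1.0 + np.exp(-(logits + logistic_noise) / temperature))
    return x_tilde + 0.5 * rng.standard_normal(n)

p_true = 0.7
print("discrete model mean      ", discrete_model(p_true, 100_000).mean())
print("continualized model mean ", continualized_model(p_true, 100_000).mean())
```

At low temperature the relaxed variable concentrates near 0 and 1, so downstream statistics of the continuous program track those of the original discrete one while remaining amenable to gradient-based and continuous-only inference machinery.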
ISBN: 9781450367127 (print)
This paper presents McNetKAT, a scalable tool for verifying probabilistic network programs. McNetKAT is based on a new semantics for the guarded and history-free fragment of probabilistic NetKAT in terms of finite-state, absorbing Markov chains. This view allows the semantics of all programs to be computed exactly, enabling construction of an automatic verification tool. Domain-specific optimizations and a parallelizing backend enable McNetKAT to analyze networks with thousands of nodes, automatically reasoning about general properties such as probabilistic program equivalence and refinement, as well as networking properties such as resilience to failures. We evaluate McNetKAT's scalability using real-world topologies, compare its performance against state-of-the-art tools, and develop an extended case study on a recently proposed data center network design.
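The absorbing-Markov-chain view mentioned above makes exact analysis a matter of linear algebra: delivery and drop probabilities are absorption probabilities, computable via the fundamental matrix. The toy two-switch topology and failure probabilities below are invented for illustration; McNetKAT derives the chain from the network program itself.

```python
# Toy illustration of exact analysis via an absorbing Markov chain: states are
# {switch S1, switch S2} (transient) and {dropped, delivered} (absorbing).
import numpy as np

# Transient-to-transient block Q and transient-to-absorbing block R of the
# transition matrix P = [[Q, R], [0, I]].
Q = np.array([[0.0, 0.9],     # S1 forwards to S2 w.p. 0.9
              [0.0, 0.0]])    # S2 never loops back
R = np.array([[0.1, 0.0],     # columns: dropped, delivered
              [0.05, 0.95]])  # S2 delivers w.p. 0.95, fails w.p. 0.05

# Fundamental matrix N = (I - Q)^(-1); absorption probabilities B = N @ R.
N = np.linalg.inv(np.eye(2) - Q)
B = N @ R
print("P(dropped   | start at S1) =", B[0, 0])   # 0.145
print("P(delivered | start at S1) =", B[0, 1])   # 0.855
```

Because the chain is finite and absorbing, these probabilities are exact rather than estimated, which is what enables checking equivalence, refinement, and resilience properties automatically.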
ISBN: 9781450362405 (print)
probabilistic models (PMs) are ubiquitously used across a variety of machine learning applications. They have been shown to successfully integrate structural prior information about data and effectively quantify uncertainty to enable the development of more powerful, interpretable, and efficient learning algorithms. This paper presents AcMC2, a compiler that transforms PMs into optimized hardware accelerators (for use in FPGAs or ASICs) that utilize Markov chain Monte Carlo methods to infer and query a distribution of posterior samples from the model. The compiler analyzes statistical dependencies in the PM to drive several optimizations to maximally exploit the parallelism and data locality available in the problem. We demonstrate the use of AcMC2 to implement several learning and inference tasks on a Xilinx Virtex-7 FPGA. AcMC2-generated accelerators provide a 47 - 100x improvement in runtime performance over a 6-core IBM Power8 CPU and a 8 - 18x improvement over an NVIDIA K80 GPU. This corresponds to a 753 - 1600x improvement over the CPU and 248 - 463x over the GPU in performance-per-watt terms.
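The kind of parallelism such a compiler extracts from statistical dependencies can be seen in a plain Gibbs sampler: variables that are conditionally independent given the rest can be updated simultaneously. The Ising-chain model, sizes, and schedule below are illustrative only, not AcMC2 output.

```python
# Sketch of an MCMC kernel with hardware-friendly parallelism: Gibbs updates on
# an Ising chain. Even-indexed and odd-indexed sites are conditionally
# independent given the other half, so each half updates in one vectorised step.
import numpy as np

rng = np.random.default_rng(1)
n, beta = 64, 0.8
spins = rng.choice([-1, 1], size=n)

def gibbs_half(spins, idx):
    left = np.roll(spins, 1)[idx]
    right = np.roll(spins, -1)[idx]
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * (left + right)))
    spins[idx] = np.where(rng.random(idx.size) < p_up, 1, -1)

even, odd = np.arange(0, n, 2), np.arange(1, n, 2)
for _ in range(1000):
    gibbs_half(spins, even)   # all even sites in parallel
    gibbs_half(spins, odd)    # then all odd sites
print("mean magnetisation:", spins.mean())
```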
ISBN: 9781450334686 (print)
We show how to automatically synthesize probabilistic programs from real-world datasets. Such a synthesis is feasible due to a combination of two techniques: (1) we borrow the idea of "sketching" from synthesis of deterministic programs, and allow the programmer to write a skeleton program with "holes"; sketches enable the programmer to communicate domain-specific intuition about the structure of the desired program and prune the search space; and (2) we design an efficient Markov chain Monte Carlo (MCMC) based synthesis algorithm to instantiate the holes in the sketch with program fragments. Our algorithm efficiently synthesizes a probabilistic program that is most consistent with the data. A core difficulty in synthesizing probabilistic programs is computing the likelihood L(P | D) of a candidate program P generating data D. We propose an approximate method to compute likelihoods using mixtures of Gaussian distributions, thereby avoiding expensive computation of integrals. The use of such approximations enables us to speed up evaluation of the likelihood of candidate programs by a factor of 1000, and makes Markov chain Monte Carlo based search feasible. We have implemented our algorithm in a tool called PSKETCH, and our results are encouraging: PSKETCH is able to automatically synthesize 16 non-trivial real-world probabilistic programs.
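The likelihood-approximation idea can be sketched as follows: run a candidate program forward, fit a small Gaussian mixture to its samples, and score the observed data under that mixture, so no integrals over the program's latent variables are needed. The candidate program, data, and number of mixture components below are invented and are not PSKETCH's exact estimator.

```python
# Approximate log L(P | D) for a candidate program P by fitting a Gaussian
# mixture to forward samples from P and scoring the dataset D under it.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

def candidate_program(n):
    """A hypothetical program filling a sketch hole: a two-component model."""
    component = rng.random(n) < 0.3
    return np.where(component, rng.normal(5.0, 1.0, n), rng.normal(0.0, 0.5, n))

data = rng.normal(0.1, 0.6, size=200)               # observed dataset D

samples = candidate_program(5000).reshape(-1, 1)    # forward samples from P
gmm = GaussianMixture(n_components=3, random_state=0).fit(samples)
log_likelihood = gmm.score_samples(data.reshape(-1, 1)).sum()
print("approx log L(P | D):", log_likelihood)
```

Because the score is cheap to recompute, an MCMC search over sketch completions can evaluate many candidate programs per second instead of performing a costly integration for each one.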
ISBN: 9781728195742 (print)
Gliomas are a lethal type of central nervous system tumor with a poor prognosis. Recently, advancements in micro-array technologies have made thousands of gene expression records of glioma patients available, enabling salient analyses in many respects. Thus, genomics has entered the field of prognosis analysis. In this work, we identify a survival-related 7-gene signature and explore two approaches for survival prediction and risk estimation. For survival prediction, we propose a novel probabilistic programming based approach, which outperforms existing traditional machine learning algorithms. An average 4-fold accuracy of 74% is obtained with the proposed algorithm. Further, we construct a prognostic risk model for risk estimation of glioma patients. This model reflects the survival of glioma patients, assigning high risk to patients with low survival.
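A gene-signature prognostic risk model of the general kind described above is typically a weighted sum of signature-gene expression values, with patients stratified at the median score. The coefficients and expression matrix below are synthetic placeholders, not the paper's 7-gene signature or its fitted model.

```python
# Sketch of a gene-signature risk score: weighted sum of expression values,
# split into high- and low-risk groups at the median. All numbers are synthetic.
import numpy as np

rng = np.random.default_rng(3)
n_patients, n_genes = 100, 7
expression = rng.normal(size=(n_patients, n_genes))               # normalised expression
coefficients = np.array([0.8, -0.5, 0.3, 0.6, -0.2, 0.4, -0.7])   # hypothetical weights

risk_score = expression @ coefficients
high_risk = risk_score > np.median(risk_score)
print("high-risk patients:", high_risk.sum(), "of", n_patients)
```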
In our increasingly connected world, data comes from many different sources, in many different forms, and is noisy, complex, and structured. To confront modern data, we need to embrace the structure inherent in the data and in the predictions. Statistical relational learning (SRL) is a subfield of machine learning that provides an effective means of approaching this problem of structured prediction. SRL frameworks use weighted logical and arithmetic expressions to easily create probabilistic graphical models (PGMs) to jointly reason over interdependent data. However, despite being well suited for modern, interconnected data, SRL faces several challenges that keep it from becoming practical and widely used in the machine learning community. In this dissertation, I address four pillars of practicality for SRL systems: scalability, expressivity, model adaptability, and usability. My work in this dissertation uses and extends Probabilistic Soft Logic (PSL), a state-of-the-art open-source SRL framework. Scalability in SRL systems is essential for using large datasets and complex models. Because of the complex nature of interconnected data, models can easily outgrow available memory. To address scalability for SRL, I developed methods that more efficiently and intelligently instantiate PGMs from templates and data. I also developed fixed-memory inference methods that can perform inference on very large models without requiring a proportional amount of memory. Expressivity allows SRL systems to represent many different problems and data patterns. Because SRL uses logical and arithmetic expressions to represent structured dependencies, SRL frameworks need to be able to express more than just what is represented by feature vectors. To address expressivity for SRL, I created a system to incorporate neural models with structured SRL inference, and expanded the interpretation of PSL weight hyperparameters to include additional types of distributions. Model adaptability i...
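To make the "weighted logical expressions become a PGM" step concrete, the sketch below grounds one PSL-style rule under the Lukasiewicz relaxation, turning it into a weighted hinge-loss penalty. The rule, weight, and soft truth values are illustrative, not taken from the dissertation.

```python
# Sketch of how an SRL template rule becomes a differentiable potential in PSL:
# a weighted rule  w : Friends(A,B) & Likes(A,X) -> Likes(B,X)  grounds to a
# hinge-loss term under the Lukasiewicz relaxation.
def lukasiewicz_and(a, b):
    return max(0.0, a + b - 1.0)

def rule_distance_to_satisfaction(body, head):
    # A rule body -> head is violated to the extent the body exceeds the head.
    return max(0.0, body - head)

w = 2.0                                           # rule weight (hyperparameter)
friends_ab, likes_ax, likes_bx = 0.9, 0.8, 0.4    # soft truth values in [0, 1]

body = lukasiewicz_and(friends_ab, likes_ax)      # 0.7
penalty = w * rule_distance_to_satisfaction(body, likes_bx)
print("weighted hinge-loss penalty:", penalty)    # 2.0 * max(0, 0.7 - 0.4) = 0.6
```

Each grounded rule contributes one such term; MAP inference then minimises the weighted sum of these convex penalties over the unobserved soft truth values.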
This work presents PγωNK, a functional probabilistic network programming language that extends probabilistic NetKAT (PNK). Like PNK, it enables probabilistic modelling of network behaviour, by providing probabilisti...
ISBN: 9786236264201, 9786236264195 (print)
This paper proposes optimization-based decision-making support for solving the planning problem of raw material/product order allocation. A few parameters (prices, demand values, defective product rate, and late delivery) are uncertain and are treated as probabilistic or fuzzy depending on data availability: parameters with historical or trial data are treated as probabilistic with fitted distribution functions, whereas parameters without any data are treated as fuzzy, with membership functions built by managers based on intuition and experience. The study aims to determine optimal values for the decision variables, namely the quantities of raw materials to be ordered and their corresponding suppliers, such that the expected total operational cost is minimized. These optimal decisions are computed with the proposed optimization model in the LINGO software using the generalized gradient algorithm. To evaluate and illustrate the proposed decision-making support, a numerical simulation was conducted. The results show that the optimal decisions were successfully attained and the minimal expected total operational cost was achieved. Furthermore, they indicate that the proposed decision-making support could be applied in manufacturing or retail industries to solve order allocation problems.
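The mixed probabilistic/fuzzy treatment described above can be illustrated with a single-supplier toy problem: demand (which has data) is sampled from a fitted distribution, the defective rate (no data) is a triangular fuzzy number defuzzified by its centroid, and a candidate order quantity is scored by expected cost. All cost terms and numbers are hypothetical, and a simple grid search stands in for the paper's LINGO model and generalized gradient algorithm.

```python
# Illustrative expected-cost model with one probabilistic and one fuzzy parameter.
import numpy as np

rng = np.random.default_rng(4)

unit_price, shortage_cost, holding_cost = 10.0, 25.0, 2.0
demand_samples = rng.normal(500, 60, size=10_000)      # fitted from historical data
defect_rate = (0.01 + 0.03 + 0.08) / 3.0               # centroid of fuzzy triple (0.01, 0.03, 0.08)

def expected_cost(order_qty):
    usable = order_qty * (1.0 - defect_rate)
    shortage = np.maximum(demand_samples - usable, 0.0)
    leftover = np.maximum(usable - demand_samples, 0.0)
    cost = unit_price * order_qty + shortage_cost * shortage + holding_cost * leftover
    return cost.mean()

candidates = np.arange(400, 701, 10)
best = min(candidates, key=expected_cost)
print("best order quantity:", best, "expected cost:", round(expected_cost(best), 1))
```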
ISBN: 9798400702983 (print)
Inference algorithms for probabilistic programming are complex imperative programs with many moving parts. Efficient inference often requires customising an algorithm to a particular probabilistic model or problem, sometimes called inference programming. Most inference frameworks are implemented in languages that lack a disciplined approach to side effects, which can result in monolithic implementations where the structure of the algorithms is obscured and inference programming is hard. Functional programming with typed effects offers a more structured and modular foundation for programmable inference, with monad transformers being the primary structuring mechanism explored to date. This paper presents an alternative approach to inference programming based on algebraic effects. Using effect signatures to specify the key operations of the algorithms, and effect handlers to modularly interpret those operations for specific variants, we develop two abstract algorithms, or inference patterns, representing two important classes of inference: Metropolis-Hastings and particle filtering. We show how our approach reveals the algorithms' high-level structure, and makes it easy to tailor and recombine their parts into new variants. We implement the two inference patterns as a Haskell library, and discuss the pros and cons of algebraic effects vis-a-vis monad transformers as a structuring mechanism for modular imperative algorithm design.
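The library itself is written in Haskell with algebraic effects and handlers; as a loose, language-agnostic analogue of the "inference pattern" idea, the sketch below writes the generic Metropolis-Hastings loop once and plugs in the model-specific pieces separately. The target density and proposal are illustrative.

```python
# Loose Python analogue: the MH pattern is generic, and the model-specific
# parts (log-density and proposal) are supplied as plug-ins -- the role played
# by effect handlers in the Haskell library described above.
import math
import random

def metropolis_hastings(log_density, propose, init, n_steps):
    """Generic MH loop; assumes a symmetric proposal."""
    current, trace = init, []
    current_lp = log_density(current)
    for _ in range(n_steps):
        candidate = propose(current)
        candidate_lp = log_density(candidate)
        if math.log(random.random()) < candidate_lp - current_lp:
            current, current_lp = candidate, candidate_lp
        trace.append(current)
    return trace

# Model-specific "handlers": a standard normal target with a random-walk proposal.
log_normal = lambda x: -0.5 * x * x
random_walk = lambda x: x + random.gauss(0.0, 0.5)

samples = metropolis_hastings(log_normal, random_walk, init=0.0, n_steps=5000)
print("posterior mean estimate:", sum(samples) / len(samples))
```

Swapping in a different proposal or scoring handler yields a new algorithm variant without touching the loop, which is the modularity the paper obtains more rigorously through typed effect signatures.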
ISBN: 9781450376136 (print)
We present lambda PSI, the first probabilistic programming language and system that supports higher-order exact inference for probabilistic programs with first-class functions, nested inference, and discrete, continuous, and mixed random variables. lambda PSI's solver is based on symbolic reasoning and computes the exact distribution represented by a program. We show that lambda PSI is practically effective: it automatically computes exact distributions for a number of interesting applications, from rational agents to information theory, many of which could so far only be handled approximately.
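To convey the flavour of exact symbolic inference (this is not PSI's actual algorithm), the sketch below computes a posterior in closed form with SymPy for a coin with a uniform prior on its bias after observing two heads and one tail.

```python
# Exact posterior by symbolic integration: p ~ Uniform(0, 1), data = HHT.
from sympy import symbols, integrate, simplify

p = symbols('p', positive=True)

prior = 1                       # Uniform(0, 1) density
likelihood = p**2 * (1 - p)     # two heads, one tail
unnormalised = prior * likelihood

Z = integrate(unnormalised, (p, 0, 1))       # exact evidence: 1/12
posterior = simplify(unnormalised / Z)       # 12*p**2*(1 - p), i.e. Beta(3, 2)
print("evidence:", Z)
print("posterior density:", posterior)
print("posterior mean:", integrate(p * posterior, (p, 0, 1)))   # 3/5
```

The result is an exact density rather than a sample-based approximation, which is the guarantee a symbolic solver like lambda PSI's provides for a far richer class of programs.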