ISBN (Print): 9781450394321
Recent algorithms designed for reinforcement learning tasks focus on finding a single optimal solution. However, in many practical applications, it is important to develop reasonable agents with diverse strategies. In this paper, we propose Diversity-Guided Policy Optimization (DGPO), an on-policy framework for discovering multiple strategies for the same task. Our algorithm uses diversity objectives to guide a latent-code-conditioned policy to learn a set of diverse strategies in a single training procedure. Experimental results show that our method efficiently finds diverse strategies in a wide variety of reinforcement learning tasks. We further show that, compared to other baselines, DGPO achieves comparable task performance with a higher diversity score or better sample efficiency.
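The abstract does not spell out the diversity objective, so the sketch below is only one plausible instantiation: a latent-code-conditioned policy paired with a DIAYN-style discriminator bonus added to the task reward. `LatentConditionedPolicy`, `diversity_bonus`, and the uniform prior over codes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class LatentConditionedPolicy(nn.Module):
    """Policy network pi(a | s, z): the latent code z selects the strategy."""
    def __init__(self, obs_dim, act_dim, n_latents, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + n_latents, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs, z_onehot):
        # Condition on the strategy code by concatenating it to the observation.
        return self.net(torch.cat([obs, z_onehot], dim=-1))

def diversity_bonus(discriminator, obs, z_idx, n_latents):
    """Intrinsic reward log q(z | s) - log p(z): positive when a visited
    state reveals which latent strategy produced it, pushing the learned
    strategies apart. Assumes a uniform prior p(z) = 1 / n_latents."""
    log_q = torch.log_softmax(discriminator(obs), dim=-1)  # (batch, n_latents)
    chosen = log_q.gather(-1, z_idx.unsqueeze(-1)).squeeze(-1)
    return chosen + torch.log(torch.tensor(float(n_latents)))
```

In this style of objective, the bonus is added to the environment reward during an otherwise standard on-policy update (e.g., PPO), so diversity and task performance are optimized in a single training procedure.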
ISBN (Print): 9781450329569
Deep societal benefits will spring from advances in data availability and in computational procedures for mining insights and inferences from large data sets. I will describe efforts to harness data for making predictions and guiding decisions, touching on work in transportation, healthcare, online services, and interactive systems. I will start with efforts to learn and field predictive models that forecast flows of traffic in greater city regions. Moving from the ground to the air, I will discuss fusing data from aircraft to make inferences about atmospheric conditions and using these results to enhance air transport. I will then focus on experiences with building and fielding predictive models in clinical medicine. I will show how inferences about outcomes and interventions can provide insights and guide decision making. Moving beyond data captured by hospitals, I will discuss the promise of transforming anonymized behavioral data drawn from web services into large-scale sensor networks for public health, including efforts to identify adverse effects of medications and to understand illness in populations. I will conclude by describing how we can use machine learning to leverage the complementarity of human and machine intellect to solve challenging problems in science and society.
ISBN (Print): 9781450309431
Assembly-based modeling is a promising approach to broadening the accessibility of 3D modeling. In assembly-based modeling, new models are assembled from shape components extracted from a database. A key challenge in assembly-based modeling is the identification of relevant components to be presented to the user. In this paper, we introduce a probabilistic reasoning approach to this problem. Given a repository of shapes, our approach learns a probabilistic graphical model that encodes semantic and geometric relationships among shape components. The probabilistic model is used to present components that are semantically and stylistically compatible with the 3D model that is being assembled. Our experiments indicate that the probabilistic model increases the relevance of presented components.
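As a rough illustration of ranking components by probabilistic compatibility, the sketch below scores candidates by smoothed pointwise mutual information with the components already placed. The paper learns a full probabilistic graphical model over semantic and geometric relationships; `CooccurrenceScorer` is a deliberately simplified stand-in for that scoring step.

```python
from collections import Counter
from itertools import combinations
import math

class CooccurrenceScorer:
    def __init__(self, assemblies):
        # assemblies: list of sets of component labels from the repository.
        self.pair = Counter()
        self.single = Counter()
        for comps in assemblies:
            self.single.update(comps)
            self.pair.update(frozenset(p) for p in combinations(sorted(comps), 2))
        self.n = len(assemblies)

    def score(self, candidate, current_assembly, alpha=1.0):
        """Sum of smoothed pointwise mutual information between the
        candidate component and each component already in the assembly:
        higher scores mean the candidate co-occurs with the current parts
        more often than chance in the repository."""
        s = 0.0
        for c in current_assembly:
            joint = self.pair[frozenset((candidate, c))] + alpha
            s += math.log(joint * self.n /
                          ((self.single[candidate] + alpha) * (self.single[c] + alpha)))
        return s
```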
ISBN (Digital): 9781728151601
ISBN (Print): 9781728151618
This paper gives a review of the literature on the application of Hidden Markov models to sentiment analysis. The review is conducted as part of a research project on semantic representation and the use of probabilistic graphical models for determining sentiment in textual data. We analyze the relevant articles, which cover the main variations of HMM implementations and a variety of use cases for sentiment classification. Finally, this review lays the groundwork for future work on techniques for semantic text representation implemented with probabilistic graphical models (Hidden Markov models), alone or in combination schemes that allow for superior classification performance.
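A minimal sketch of the classification scheme common to much of the surveyed work: train one discrete HMM per sentiment class and label a token sequence with the class whose model assigns it the highest likelihood. The parameter layout (`log_pi`, `log_A`, `log_B`) and the assumption of pre-trained models (e.g., via Baum-Welch) are illustrative choices, not drawn from any single reviewed article.

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Forward algorithm in log space; returns log P(obs | HMM).
    obs: sequence of symbol indices; log_pi: (K,) initial log-probs;
    log_A: (K, K) transition log-probs; log_B: (K, V) emission log-probs."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Sum over previous states in log space, then emit the next symbol.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def classify(obs, models):
    """models: dict mapping class label -> (log_pi, log_A, log_B).
    Returns the label whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda c: log_forward(obs, *models[c]))
```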
Bayesian phylogenetic inference aims to estimate the evolutionary relationships among different lineages (species, populations, gene families, viral strains, etc.) in a model-based statistical framework that uses the ...
Algorithms for inferring the structure of Bayesian networks from data have become an increasingly popular method for uncovering the direct and indirect influences among variables in complex systems. A Bayesian approach to structure learning uses posterior probabilities to quantify the strength with which the data and prior knowledge jointly support each possible graph feature. Existing Markov chain Monte Carlo (MCMC) algorithms for estimating these posterior probabilities are slow to mix and converge, especially for large networks. We present a novel Markov blanket resampling (MBR) scheme that intermittently reconstructs the Markov blanket of nodes, thus allowing the sampler to more effectively traverse low-probability regions between local maxima. Because we can derive both the forward and backward directions of the MBR proposal distribution, the Metropolis-Hastings algorithm can be used to account for any asymmetry in these proposals. Experiments across a range of network sizes show that the MBR scheme outperforms other state-of-the-art algorithms, both in learning performance and in convergence rate. In particular, MBR achieves better learning performance than the other algorithms when the number of observations is relatively small, and faster convergence when the number of variables in the network is large.
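The point about proposal asymmetry reduces to the standard Metropolis-Hastings correction; the sketch below shows just that acceptance step. The graph score and the MBR proposal itself are placeholders here, since the abstract does not define them.

```python
import math
import random

def mh_accept(log_score_current, log_score_proposed,
              log_q_forward, log_q_backward):
    """Accept a proposed graph G' with probability
    min(1, [P(G'|D) q(G|G')] / [P(G|D) q(G'|G)]), computed in log space.
    log_q_forward  = log q(G'|G), the probability of proposing the move
                     (e.g., one Markov blanket resampling step);
    log_q_backward = log q(G|G'), the probability of the reverse move."""
    log_ratio = (log_score_proposed - log_score_current
                 + log_q_backward - log_q_forward)
    # 1 - random() lies in (0, 1], so the log is always defined.
    return math.log(1.0 - random.random()) < min(0.0, log_ratio)
```

Without the `log_q_backward - log_q_forward` term, an asymmetric proposal such as MBR would bias the sampler away from the true posterior over graphs.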
A fundamental challenge in developing high-impact machine learning technologies is balancing the need to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this paper, we introduce two new formalisms for modeling structured data, and show that they can both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. We unite three approaches from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three lead to the same inference objective. We then define HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define using a syntax based on first-order logic. We introduce an algorithm for inferring most-probable variable assignments (MAP inference) that is much more scalable than general-purpose convex optimization methods, because it uses message passing to take advantage of sparse dependency structures. We then show how to learn the parameters of HL-MRFs. The learned HL-MRFs are as accurate as analogous discrete models, but much more scalable. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
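To make the connection between first-order rules and convex potentials concrete, here is how a single PSL-style rule turns into a hinge-loss term under the Lukasiewicz relaxation. The example rule and weight are illustrative; only the functional form follows the HL-MRF construction.

```python
def hinge_potential(body, head, weight=1.0, squared=False):
    """Distance to satisfaction of a Lukasiewicz-relaxed rule
    body_1 & ... & body_n -> head over [0, 1] truth values:
    max(0, sum(body) - (len(body) - 1) - head).
    Zero when the rule is satisfied; grows linearly (or quadratically,
    if squared=True) with the degree of violation."""
    d = max(0.0, sum(body) - (len(body) - 1) - head)
    return weight * (d * d if squared else d)

# Illustrative rule Friends(a, b) & Votes(a, p) -> Votes(b, p):
# a nearly true body with a doubtful head incurs a large penalty.
energy = hinge_potential(body=[0.9, 0.8], head=0.3, weight=2.0)
print(energy)  # 2 * max(0, 0.9 + 0.8 - 1 - 0.3) = 0.8
```

Because each potential is a (possibly squared) hinge of a linear function over [0, 1] variables, the MAP problem is convex, which is what lets the message-passing inference scale.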
The Libra Toolkit is a collection of algorithms for learning and inference with discrete probabilistic models, including Bayesian networks, Markov networks, dependency networks, and sum-product networks. Compared to other toolkits, Libra places a greater emphasis on learning the structure of tractable models in which exact inference is efficient. It also includes a variety of algorithms for learning graphical models in which inference is potentially intractable, and for performing exact and approximate inference. Libra is released under a 2-clause BSD license to encourage broad use in academia and industry.
Improving the quality of care for hip arthroplasty (replacement) patients requires the systematic evaluation of clinical performance of implants and the identification of "outlier" devices that have an especially high risk of reoperation ("revision"). Postmarket surveillance of arthroplasty implants, which rests on the analysis of large patient registries, has been effective in identifying outlier implants such as the ASR metal-on-metal hip resurfacing device that was recalled. Although identifying an implant as an outlier implies a causal relationship between the implant and revision risk, traditional signal detection methods use classical biostatistical methods. The field of probabilistic graphical modeling of causal relationships has developed tools for rigorous analysis of causal relationships in observational data. The purpose of this study was to evaluate one causal discovery algorithm (PC) to determine its suitability for hip arthroplasty implant signal detection. Simulated data were generated using distributions of patient and implant characteristics, and causal discovery was performed using the TETRAD software package. Two sizes of registries were simulated: (1) a statewide registry in Michigan and (2) a nationwide registry in the United Kingdom. The results showed that the algorithm performed better for the simulation of a large national registry. The conclusion is that the causal discovery algorithm used in this study may be a useful tool for implant signal detection for large arthroplasty registries; regional registries may be able to detect only implants that perform especially poorly.
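To illustrate the kind of constraint-based search the PC algorithm performs (the study itself ran PC inside TETRAD), here is a simplified skeleton phase using Fisher-z partial-correlation tests. Real PC restricts conditioning sets to current neighbors and then orients edges; this sketch, with its assumed Gaussian test and small `max_cond`, only shows the independence-based edge pruning.

```python
from itertools import combinations
import numpy as np
from scipy import stats

def partial_corr_pvalue(data, i, j, cond):
    """Fisher-z test of the partial correlation between columns i and j
    given the columns in cond (data: n x p array)."""
    idx = [i, j] + list(cond)
    prec = np.linalg.pinv(np.corrcoef(data[:, idx], rowvar=False))
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    r = np.clip(r, -0.999999, 0.999999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(len(data) - len(cond) - 3)
    return 2 * (1 - stats.norm.cdf(abs(z)))

def pc_skeleton(data, alpha=0.05, max_cond=2):
    """Start fully connected; drop edge (i, j) once any conditioning set
    of size <= max_cond renders the pair conditionally independent."""
    p = data.shape[1]
    adj = {(i, j) for i in range(p) for j in range(p) if i < j}
    for size in range(max_cond + 1):
        for (i, j) in sorted(adj):
            others = [k for k in range(p) if k not in (i, j)]
            for cond in combinations(others, size):
                if partial_corr_pvalue(data, i, j, cond) > alpha:
                    adj.discard((i, j))
                    break
    return adj
```

On registry-style data, the surviving edges around the "revision" variable are the candidates for implant signals, which is why sample size (statewide versus nationwide) matters so much for the tests' power.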