ISBN: 9781509061679 (print)
We consider a probabilistic graphical model for the problem of tracking entities moving among a finite set of sites. The observations consist of counts of the number of entities at sites and in transit between sites. We adopt a Bayesian approach and use importance sampling to obtain samples from the model. A backtrack-free proposal distribution is considered, and an oracle is obtained through the construction of appropriate network flow problems.
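To illustrate the importance sampling step mentioned in this abstract, here is a minimal sketch of generic self-normalized importance sampling on a toy discrete target. It does not reproduce the paper's backtrack-free proposal or its network-flow oracle; the function name `self_normalized_is`, the four-state target, and the uniform proposal are all assumptions chosen for illustration.

```python
import random


def self_normalized_is(target_weight, proposal_sample, proposal_prob, f, n, rng):
    """Estimate E_p[f(X)] when the target p is known only up to a constant."""
    numerator = 0.0
    denominator = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        # Importance weight: unnormalized target mass over proposal probability.
        w = target_weight(x) / proposal_prob(x)
        numerator += w * f(x)
        denominator += w
    return numerator / denominator


# Hypothetical toy target over states 0..3 with unnormalized mass 1, 2, 3, 4.
states = [0, 1, 2, 3]
unnorm = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

rng = random.Random(0)
estimate = self_normalized_is(
    target_weight=lambda x: unnorm[x],
    proposal_sample=lambda r: r.choice(states),  # uniform proposal
    proposal_prob=lambda x: 1.0 / len(states),
    f=lambda x: float(x),
    n=20000,
    rng=rng,
)

# Exact expectation for comparison: sum(x * p~(x)) / sum(p~(x)).
exact = sum(x * unnorm[x] for x in states) / sum(unnorm.values())
print(estimate, exact)
```

In a constrained model such as the one described, a proposal that never generates infeasible configurations keeps every weight finite and nonzero, which is presumably what makes a backtrack-free proposal attractive.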
ISBN: 9781450343916 (print)
Intelligent agents must be able to handle the complexity and uncertainty of the real world. Logical AI has focused mainly on the former, and statistical AI on the latter. Markov logic combines the two by attaching weights to first-order formulas and viewing them as templates for features of Markov networks. Inference algorithms for Markov logic draw on ideas from satisfiability, Markov chain Monte Carlo and knowledge-based model construction. Learning algorithms are based on the voted perceptron, pseudo-likelihood and inductive logic programming. Markov logic has been successfully applied to a wide variety of problems in natural language understanding, vision, computational biology, social networks and others, and is the basis of the open-source Alchemy system.
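The idea of weighted first-order formulas acting as templates for Markov network features can be sketched on a tiny domain. The example below assumes the standard Markov logic distribution, in which a world's probability is proportional to the exponential of the weighted count of true formula groundings. The predicates, the two constants, and the weight 1.5 are hypothetical, and inference is done by brute-force enumeration rather than by any algorithm from the Alchemy system.

```python
import itertools
import math

# Hypothetical domain: two people, two unary predicates.
people = ["A", "B"]
atoms = [(pred, p) for pred in ("Smokes", "Cancer") for p in people]
weight = 1.5  # assumed weight for the formula Smokes(x) => Cancer(x)


def n_true_groundings(world):
    """Count the groundings of Smokes(x) => Cancer(x) that hold in a world."""
    return sum(
        1 for p in people
        if (not world[("Smokes", p)]) or world[("Cancer", p)]
    )


def unnormalized(world):
    """Unnormalized probability: exp(weight * number of true groundings)."""
    return math.exp(weight * n_true_groundings(world))


# Enumerate all 2^4 truth assignments to the ground atoms.
worlds = [
    dict(zip(atoms, values))
    for values in itertools.product([False, True], repeat=len(atoms))
]
z = sum(unnormalized(w) for w in worlds)  # partition function


def prob(event):
    """Probability of the set of worlds satisfying `event`."""
    return sum(unnormalized(w) for w in worlds if event(w)) / z


p_joint = prob(lambda w: w[("Smokes", "A")] and w[("Cancer", "A")])
p_evidence = prob(lambda w: w[("Smokes", "A")])
p_cancer_given_smokes = p_joint / p_evidence
print(p_cancer_given_smokes)
```

Because the groundings for A and B do not interact here, the conditional probability reduces to the logistic function of the weight; a higher weight makes worlds that violate the formula less likely without making them impossible.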
ISBN: 9781509019298 (print)
We present a context-aware hybrid classification system for the problem of fine-grained product class recognition in computer vision. Recently, retail product recognition has become an interesting computer vision research topic. We focus on the classification of products on shelves in a store. This is a very challenging classification problem because many product classes are visually similar in terms of shape, color, texture, and metric size. On shelves, the same or similar products are more likely to appear adjacent to each other and to be displayed in certain arrangements rather than at random. The arrangement of the products on the shelves has spatial continuity in both brand and metric size. By using this context information, the co-occurrence of the products and the adjacency relations between them can be statistically modeled. The proposed hybrid approach improves the accuracy of context-free image classifiers, such as Support Vector Machines (SVMs), by combining them with a probabilistic graphical model such as a Hidden Markov Model (HMM) or a Conditional Random Field (CRF). The fundamental goal of this paper is to use contextual relationships on retail shelves to improve classification accuracy through a context-aware approach.
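One way to picture the combination of a context-free classifier with an HMM is Viterbi decoding over a row of shelf positions. The sketch below is not the paper's implementation: it assumes that per-position classifier probabilities can stand in for emission scores, that the class prior is uniform, and that a hypothetical transition matrix favors neighbors of the same class. All numbers are made up for illustration.

```python
import math


def viterbi(log_emission, log_transition, log_prior):
    """Return the most probable class sequence for a row of shelf positions."""
    n_cls = len(log_prior)
    score = [log_prior[c] + log_emission[0][c] for c in range(n_cls)]
    back = []  # back[t-1][c] = best previous class if position t has class c
    for t in range(1, len(log_emission)):
        new_score, pointers = [], []
        for c in range(n_cls):
            best_prev = max(
                range(n_cls), key=lambda k: score[k] + log_transition[k][c]
            )
            new_score.append(
                score[best_prev] + log_transition[best_prev][c] + log_emission[t][c]
            )
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final class.
    path = [max(range(n_cls), key=lambda c: score[c])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical classifier outputs for 5 adjacent products and 2 classes.
probs = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55], [0.8, 0.2], [0.9, 0.1]]
# Assumed adjacency model: neighbors share a class with probability 0.8.
transition = [[0.8, 0.2], [0.2, 0.8]]
prior = [0.5, 0.5]

log_emission = [[math.log(p) for p in row] for row in probs]
log_transition = [[math.log(p) for p in row] for row in transition]
log_prior = [math.log(p) for p in prior]

independent = [max(range(2), key=lambda c: row[c]) for row in probs]
decoded = viterbi(log_emission, log_transition, log_prior)
print(independent, decoded)
```

In this toy row the classifier alone mislabels the middle product by a narrow margin, while the adjacency model pulls it back to the class of its neighbors, which is the kind of correction the context-aware approach aims for.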
ISBN: 9781509059102 (print)
The AMIDST Toolbox is an open source Java 8 library for scalable learning of probabilistic graphical models (PGMs) based on both batch and streaming data. An important application domain with streaming data characteristics is the banking sector, where we may want to monitor individual customers (based on their financial situation and behavior) as well as the general economic climate. Using a real financial data set from a Spanish bank, we have previously proposed and demonstrated a novel PGM framework for performing this type of data analysis with particular focus on concept drift. The framework is implemented in the AMIDST Toolbox, which was also used to conduct the reported analyses. In this paper, we provide an overview of the toolbox and illustrate with code examples how the toolbox can be used for setting up and performing analyses of this particular type.
ISBN: 9781509059102 (print)
With the prevalence of social media such as Twitter, short texts such as microblogs have become an important mode of text on the Internet. In contrast to other forms of media, such as newspapers, the text in these social media posts usually contains fewer words and concentrates on a much narrower selection of topics. For these reasons, traditional LDA-based sentiment and topic modeling techniques generally do not work well on social media data. Another characteristic feature of this data is the use of special meta tokens, such as hashtags, which carry unique semantic meanings that are not captured by ordinary words. In recent years, many topic modeling techniques have been proposed for social media data, but the majority of this work does not take into account the special nature of tokens such as hashtags and treats them as ordinary words. In this paper, we propose probabilistic graphical models to address the problem of discovering latent topics and their sentiment from social media data, mainly microblogs such as Twitter. We first propose MTM (Microblog Topic Model), a generative model that assumes each social media post is generated from a single topic and models words and hashtags separately. We then propose MSTM (Microblog Sentiment Topic Model), an extension of MTM that also captures the sentiment associated with the topics. We evaluated our models using a Twitter dataset, and experimental
ISBN: 9783319487991; 9783319487984 (print)
The suitable operation of mobile robots when providing Ambient Assisted Living (AAL) services calls for robust object recognition capabilities. Probabilistic graphical models (PGMs) have become the de facto choice in recognition systems that aim to efficiently exploit contextual relations among objects while also dealing with the uncertainty inherent to the robot workspace. However, these models can behave incoherently when operating over the long term outside the laboratory, e.g. while recognizing objects in peculiar configurations or belonging to new types. In this work we propose a recognition system that resorts to PGMs and common-sense knowledge, represented in the form of an ontology, to detect those inconsistencies and learn from them. The use of the ontology carries additional advantages, e.g. the possibility of verbalizing the robot's knowledge. An initial demonstration of the system's capabilities has been carried out with very promising results.
ISBN: 9783319320557 (digital)
ISBN: 9783319320557; 9783319320540 (print)
Event detection with spatio-temporal correlation is one of the most popular applications of wireless sensor networks. This kind of task tends to be a difficult big data analysis problem because of the massive data generated by large-scale sensor networks such as water sensor networks, especially in the context of real-time analysis. To reduce the computational cost of abnormal event detection and improve the response time, sensor node selection is needed to cut down the amount of data used for the spatio-temporal correlation analysis. In this paper, a connected dominating set (CDS) approach is introduced to select backbone nodes from the sensor network. Furthermore, a spatio-temporal model is proposed to perform the spatio-temporal correlation analysis, where a Markov chain is adopted to model the temporal dependency among the different sensor nodes and a Bayesian Network (BN) is used to model the spatial dependency. The proposed approach and model have been applied to the real-time detection of urgent events (e.g. water pollution incidents) with water sensor networks. Preliminary experimental results on simulated data indicate that our solution can achieve better performance in terms of response time and scalability than the simple threshold algorithm and the BN-only algorithm.
ISBN: 9789811030024; 9789811030017 (print)
We consider the problem of finding the M assignments with the maximum probabilities (or equivalently, the M-best MAP assignments) on a probabilistic graphical model. The covering graph approximation method provides an upper bound on each of the true M-best MAP costs. However, the tightness of these bounds depends closely on how the parameters of the duplicate nodes are split. We propose a monotonic algorithm to tighten the M-best MAP bounds by finding the optimal splitting of these parameters. Experimental results on synthetic and real problems show that our algorithm provides much tighter bounds than those obtained by splitting the parameters uniformly.
ISBN: 9781450341431 (print)
Many fundamental problems in natural language processing rely on determining what entities appear in a given text. Commonly referred to as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation. Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem. We propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. Input mentions (i.e., linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context. The model is based on simple sufficient statistics extracted from data, and thus relies on few parameters to be learned. Our method requires neither extensive feature engineering nor an expensive training procedure. We use loopy belief propagation to perform approximate inference. The low complexity of our model makes this step sufficiently fast for real-time usage. We demonstrate the accuracy of our approach on a wide range of benchmark datasets, showing that it matches, and in many cases outperforms, existing state-of-the-art methods.
In this paper, we give, for the first time, a formal mathematical language to the steps currently used by financial institutions when calculating the impact of a stress scenario on a balance sheet that depends on more granular or different factors than those provided in the scenario. We introduce the language of probabilistic graphical models, a technique rooted in machine learning, to show how the different models used at each step can be put together in a coherent picture, thus giving a holistic view of the entire model setup. This gives us a solid basis to discuss some weaknesses and problems with the stress-testing exercises run by the industry today. We present empirical analyses to better substantiate some of our claims.