Cancer is a somatic evolutionary process characterized by the accumulation of mutations, which contribute to tumor growth, clinical progression, immune escape, and the development of drug resistance. Evolutionary theory can be used to analyze the dynamics of tumor cell populations and to make inferences about the evolutionary history of a tumor from molecular data. We review recent approaches to modeling the evolution of cancer, including population dynamics models of tumor initiation and progression, phylogenetic methods to model the evolutionary relationship between tumor subclones, and probabilistic graphical models to describe dependencies among mutations. Evolutionary modeling helps to understand how tumors arise and will also play an increasingly important prognostic role in predicting disease progression and the outcome of medical interventions, such as targeted therapy.
Bayesian networks (BNs) provide a powerful graphical model for encoding the probabilistic relationships among a set of variables, and hence can naturally be used for classification. However, Bayesian network classifiers (BNCs) learned in the common way using likelihood scores usually achieve only mediocre classification accuracy because these scores are less specific to classification and instead suit a general inference problem. We propose risk minimization by cross validation (RMCV) using the 0/1 loss function, which is a classification-oriented score for unrestricted BNCs. RMCV is an extension of classification-oriented scores commonly used in learning restricted BNCs and non-BN classifiers. Using small real and synthetic problems, which allow all possible graphs to be learned, we empirically demonstrate the superiority of RMCV over marginal and class-conditional likelihood-based scores with respect to classification accuracy. Experiments using twenty-two real-world datasets show that BNCs learned using an RMCV-based algorithm significantly outperform the naive Bayesian classifier (NBC), tree augmented NBC (TAN), and other BNCs learned using marginal or conditional likelihood scores, and are on par with non-BN state-of-the-art classifiers, such as the support vector machine, neural network, and classification tree. These experiments also show that an optimized version of RMCV is faster than all unrestricted BNCs and comparable with the neural network with respect to run-time. The main conclusion from our experiments is that unrestricted BNCs, when learned properly, can be a good alternative to restricted BNCs and traditional machine-learning classifiers with respect to both accuracy and efficiency. (C) 2011 Elsevier Inc. All rights reserved.
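The idea of scoring a candidate classifier by its cross-validated 0/1 loss, rather than by likelihood, can be illustrated with a minimal sketch. This is not the RMCV algorithm from the abstract: the two candidates (`fit_majority`, `fit_naive_bayes`), the interleaved fold assignment, and the toy data are assumptions chosen only to show the score, and a real search would range over Bayesian network structures.

```python
import math
from collections import Counter


def cv_zero_one_risk(fit, rows, labels, k=5):
    """Estimate the 0/1 risk of `fit` by k-fold cross validation (lower is better)."""
    n = len(rows)
    errors = 0
    for fold in range(k):
        # Assumed fold scheme: every k-th sample, starting at `fold`, is held out.
        test_idx = set(range(fold, n, k))
        train_rows = [r for i, r in enumerate(rows) if i not in test_idx]
        train_labels = [c for i, c in enumerate(labels) if i not in test_idx]
        predict = fit(train_rows, train_labels)
        # 0/1 loss: one unit of loss per misclassified held-out sample.
        errors += sum(predict(rows[i]) != labels[i] for i in test_idx)
    return errors / n


def fit_majority(rows, labels):
    """Hypothetical candidate A: ignore the features, predict the most common class."""
    majority = Counter(labels).most_common(1)[0][0]
    return lambda row: majority


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Hypothetical candidate B: naive Bayes over binary features, Laplace-smoothed."""
    class_counts = Counter(labels)
    # counts[(c, j, v)] = number of class-c training rows whose feature j equals v
    counts = Counter()
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            counts[(c, j, v)] += 1

    def predict(row):
        def log_posterior(c):
            lp = math.log(class_counts[c])
            for j, v in enumerate(row):
                # Binary features assumed, hence 2 * alpha in the denominator.
                lp += math.log((counts[(c, j, v)] + alpha) / (class_counts[c] + 2 * alpha))
            return lp
        return max(class_counts, key=log_posterior)

    return predict


# Toy data (an assumption): the class equals the first of two binary features.
rows = [(i % 2, (i // 2) % 2) for i in range(20)]
labels = [r[0] for r in rows]

risks = {
    "majority": cv_zero_one_risk(fit_majority, rows, labels),
    "naive_bayes": cv_zero_one_risk(fit_naive_bayes, rows, labels),
}
best = min(risks, key=risks.get)  # select the candidate with the lowest estimated risk
```

The selection step at the end is the point of the sketch: candidates are ranked by estimated misclassification rate on held-out data, which targets classification accuracy directly.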
ISBN: (digital) 9783030782924
ISBN: (print) 9783030782924; 9783030782917
Team communication modeling offers great potential for adaptive learning environments for team training. However, the complex dynamics of team communication pose significant challenges for team communication modeling. To address these challenges, we present a hybrid framework integrating deep learning and probabilistic graphical models that analyzes team communication utterances with respect to the intent of the utterance and the directional flow of communication within the team. The hybrid framework utilizes conditional random fields (CRFs) that use deep learning-based contextual, distributed language representations extracted from team members' utterances. An evaluation with communication data collected from six teams during a live training exercise indicates that linear-chain CRFs utilizing ELMo utterance embeddings (1) outperform both multi-task and single-task variants of stacked bidirectional long short-term memory networks using the same distributed representations of the utterances, (2) outperform a hybrid approach that uses non-contextual utterance representations for the dialogue classification tasks, and (3) demonstrate promising domain-transfer capabilities. The findings suggest that the hybrid multidimensional team communication analysis framework can accurately recognize speaker intent and model the directional flow of team communication to guide adaptivity in team training environments.
Dental panoramic radiographic images are commonly used as biometrics for human identification. In this study, a novel method is presented for identifying humans by matching 2D panoramic dental X-ray images. Each tooth is first identified and labeled with support vector machines and probabilistic graphical models. Missing teeth and dental restorations are also detected. Then, matching scores between images are calculated according to tooth-wise appearance and geometric similarities by taking dental restorations into account. The method is tested on a dataset including 206 dental panoramic X-ray images of 170 different subjects. The proposed method has 81% rank-1 accuracy and 89% rank-2 accuracy.
ISBN: (print) 9783031116476; 9783031116469
The goal of this research is to strengthen the teaching strategy with quantitatively measured learning analytics. The entropy-based learning analytics aims to measure and understand students' progress by quantitatively measuring the difference between the content to be learned, the tutors' expectation of understanding, and the student's knowledge. This quantification will take steps similar to those taken by Shannon for his information theory, using a mathematical formalism to quantitatively measure knowledge (equivalent to Shannon's entropy) and knowledge transfer (equivalent to Shannon's mutual information). Knowledge graphs will be used to represent the content to be learned, the tutors' expectations, and the student's knowledge. Early results reveal that advanced analytical algorithms and a graph entropy specified for educational applications are necessary for this research project to succeed.
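For reference, the two classical Shannon quantities that the abstract takes as its analogues can be computed as in the sketch below. It covers only the standard definitions over discrete distributions. The graph entropy the authors have in mind is not specified in the abstract, and reading the joint table as "tutor expectation versus student knowledge" is purely an illustrative assumption.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution given as a list of probabilities."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def mutual_information(joint):
    """Mutual information in bits from a joint probability table (list of rows)."""
    px = [sum(row) for row in joint]          # marginal over rows
    py = [sum(col) for col in zip(*joint)]    # marginal over columns
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


# Hypothetical example: rows = what the tutor expects (known / unknown),
# columns = what the student actually knows (known / unknown).
perfectly_aligned = [[0.5, 0.0], [0.0, 0.5]]
unrelated = [[0.25, 0.25], [0.25, 0.25]]

h_uniform = entropy([0.5, 0.5])
mi_aligned = mutual_information(perfectly_aligned)
mi_unrelated = mutual_information(unrelated)
```

With these toy tables, a fully aligned pair shares one bit of information, while a statistically independent pair shares none.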
Credal networks relax the precise probability requirement of Bayesian networks, enabling a richer representation of uncertainty in the form of closed convex sets of probability measures. The increase in expressiveness comes at the expense of higher computational costs. In this paper, we present a new variable elimination algorithm for exactly computing posterior inferences in extensively specified credal networks, which is empirically shown to outperform a state-of-the-art algorithm. The algorithm is then turned into a provably good approximation scheme, that is, a procedure that for any input is guaranteed to return a solution not worse than the optimum by a given factor. Remarkably, we show that when the networks have bounded treewidth and bounded number of states per variable the approximation algorithm runs in time polynomial in the input size and in the inverse of the error factor, thus being the first known fully polynomial-time approximation scheme for inference in credal networks. (C) 2012 Elsevier Inc. All rights reserved.
ISBN: (print) 9798350386783; 9798350386776
This paper introduces a novel recommendation model, namely, the maximum feature dependency semi-naive Bayesian (MDN) model, which is aimed at addressing feature correlation in restaurant selection. Existing restaurant recommendation methods ignore feature correlation and negative samples, both of which could be used to improve recommendation accuracy. In contrast, the proposed recommendation model establishes maximum dependency features for each attribute, using both positive and negative samples to enhance its accuracy. The model first performs logarithmic division transformation on naive Bayesian classification to obtain user behavior patterns that can be sorted. Then, it combines frequent pattern mining to optimize user behavior patterns and obtain a semi-naive Bayesian recommendation model for restaurant recommendations. Experiments on a Yelp dataset show that the restaurant recommendation algorithm based on naive Bayesian user behavior patterns outperforms other basic recommendation methods. The proposed recommendation model consistently improves on recommendation methods based on naive Bayesian user behavior patterns, whereas other semi-naive Bayesian methods lose accuracy when they take feature correlation into account. The restaurant recommendation method based on user behavior patterns also improves the interpretability of restaurant recommendations.
ISBN: (print) 9781509060146
The combined use of graphical models and probabilistic techniques has been shown to be highly effective in applications involving uncertain data from industrial environments. In addition, industrial data analysis has become a necessary and interesting approach for systems in Industry 4.0. The present paper proposes a probabilistic graphical model to infer the probability that a path presents a packet delivery ratio (PDR) above a threshold. The technique used was a Bayesian network; selection and discretization techniques were applied to the data before processing. These data were collected from a real WirelessHART network. The results show the applicability of the probabilistic method to real problems in industrial networks.
ISBN: (print) 9781479923748
Automated cell tracking methods are still error-prone. On very large data sets, uncertainty measures are thus needed to guide the expert to the most ambiguous events so these can be corrected with minimal effort. We present two easy-to-use methods to sample multiple proposal solutions from a tracking-by-assignment graphical model and experimentally evaluate the benefits of the uncertainty measures derived. Expert time for proof-reading is reduced greatly compared to random selection of predicted events.
ISBN: (print) 9783642237836; 9783642237829
If several friends of Smith have committed petty thefts, what would you say about Smith? Most people would not be surprised if Smith is a hardened criminal. Guilt-by-association methods combine weak signals to derive stronger ones, and have been extensively used for anomaly detection and classification in numerous settings (e.g., accounting fraud, cyber-security, calling-card fraud). The focus of this paper is to compare and contrast several very successful guilt-by-association methods: Random Walk with Restarts, Semi-Supervised Learning, and Belief Propagation (BP). Our main contributions are two-fold: (a) theoretically, we prove that all the methods result in a similar matrix inversion problem; (b) for practical applications, we developed FABP, a fast algorithm that yields a 2× speedup and accuracy equal to or higher than that of BP, and is guaranteed to converge. We demonstrate these benefits using synthetic and real datasets, including YahooWeb, one of the largest graphs ever studied with BP.
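To make the "matrix inversion problem" concrete for one of the three methods, the standard Random Walk with Restarts fixed point can be written out as below. The notation is an assumption rather than the paper's own: $W$ is the column-normalized adjacency matrix, $e$ the restart (query) vector, and $c \in (0,1)$ the probability of continuing the walk, so $1-c$ is the restart probability. The corresponding forms for Semi-Supervised Learning and BP are not reproduced here.

```latex
\begin{align}
  r &= (1 - c)\, e + c\, W r
      && \text{(steady state: restart with prob. } 1-c \text{, else follow an edge)} \\
  (I - c\, W)\, r &= (1 - c)\, e
      && \text{(collect the terms in } r \text{)} \\
  r &= (1 - c)\,(I - c\, W)^{-1} e
      && \text{(the inverse exists because } \rho(cW) \le c < 1 \text{)}
\end{align}
```

The ranking vector $r$ is thus the solution of a linear system in $I - cW$, which is the kind of inversion problem the abstract refers to.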