We generalize the approach of Liu and Lawrence (1999) to multiple changepoint problems in which the number of changepoints is unknown. The approach is based on a dynamic programming recursion for efficient calculation of the marginal distribution of the data with the hidden parameters integrated out. For the estimation of the hyperparameters, we propose to use Monte Carlo EM when training data are available. The samples from the posterior obtained by our algorithm are independent, which avoids the convergence issues associated with the MCMC approach. We illustrate our approach on limited simulations and on real data.
A joint source-channel multiple description (JSC-MD) framework for signal estimation and communication in resource-constrained lossy networks is presented. To keep the encoder complexity at a minimum, a signal is coded by a multiple description quantizer (MDQ) with neither entropy nor channel coding. The code diversity of MDQ and the path diversity of the network are exploited by decoders to combat transmission errors. A key design objective is resource scalability: powerful nodes in the network can perform JSC-MD estimation under the criteria of maximum a posteriori probability (MAP) or minimum mean-square error (MMSE), while primitive nodes resort to simpler MD decoding, all working with the same MDQ code. The application of JSC-MD to distributed estimation of hidden Markov models in a sensor network is demonstrated. The proposed JSC-MD MAP estimator is a longest-path algorithm on a weighted directed acyclic graph, while the JSC-MD MMSE decoder is an extension of the well-known forward-backward algorithm to multiple descriptions. Both algorithms simultaneously exploit the source memory, the redundancy of the fixed-rate MDQ and the inter-description correlations. They outperform the existing hard-decision MDQ decoders by large margins (up to 8 dB). For Gaussian Markov sources, the complexity of JSC-MD distributed MAP sequence estimation can be made as low as that of typical single description Viterbi-type algorithms.
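The abstract above casts MAP estimation as a longest-path search in a weighted directed acyclic graph. The following is a minimal, generic sketch of that kind of search in Python, not the authors' JSC-MD estimator: the function name `longest_path`, the edge-list format, and the assumption that nodes are already numbered in topological order (as in a trellis) are all illustrative choices.

```python
from collections import defaultdict


def longest_path(num_nodes, edges, source, target):
    """Heaviest source-to-target path in a weighted DAG.

    Assumption (illustrative): nodes are 0..num_nodes-1 and every edge
    (u, v, weight) satisfies u < v, so the index order is a topological order.
    Returns (total_weight, path); path is [] if target is unreachable.
    """
    incoming = defaultdict(list)
    for u, v, w in edges:
        if u >= v:
            raise ValueError("edges must go from a lower to a higher index")
        incoming[v].append((u, w))

    best = [float("-inf")] * num_nodes  # best[v]: heaviest path weight to v
    back = [None] * num_nodes           # back[v]: predecessor on that path
    best[source] = 0.0

    # Dynamic programming sweep in topological order.
    for v in range(num_nodes):
        for u, w in incoming[v]:
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                back[v] = u

    if best[target] == float("-inf"):
        return best[target], []

    # Trace predecessors back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(back[path[-1]])
    return best[target], path[::-1]
```

In a decoding setting the edge weights would typically be log-probabilities, so that the heaviest path corresponds to the most probable state sequence; that mapping is an assumption of this sketch rather than a detail given in the abstract.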
Microarrays have been developed that tile the entire nonrepetitive genomes of many different organisms, allowing for the unbiased mapping of active transcription regions or protein binding sites across the entire genome. These tiling array experiments produce massive correlated data sets that have many experimental artifacts, presenting many challenges to researchers that require innovative analysis methods and efficient computational algorithms. This paper presents a doubly stochastic latent variable analysis method for transcript discovery and protein binding region localization using tiling array data. This model is unique in that it considers actual genomic distance between probes. Additionally, the model is designed to be robust to cross-hybridized and nonresponsive probes, which can often lead to false-positive results in microarray experiments. We apply our model to a transcript finding data set to illustrate the consistency of our method. Additionally, we apply our method to a spike-in experiment that can be used as a benchmark data set for researchers interested in developing and comparing future tiling array methods. The results indicate that our method is very powerful and accurate, and can be used on a single sample without control experiments, thus defraying some of the overhead cost of conducting experiments on tiling arrays.
ISBN (print): 9781424437566
Our paper presents a new approach for the recognition of highlights in soccer video. Our contribution consists of the combination of Bayes' theorem inference and Hidden Markov Models (HMMs). We build HMMs to calculate the probabilities that a test video segment belongs to the highlight and non-highlight classes. Then, we apply Bayes' theorem to these two probabilities. Our system has achieved an accuracy of 95.6%, which compares well with other highlight detection methods.
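The two-step decision described in this abstract (score a segment under a highlight HMM and a non-highlight HMM, then combine the two scores with Bayes' theorem) can be sketched as follows. This is only an illustration under assumptions: the function name `highlight_posterior`, the use of log-likelihoods as inputs, and the default prior of 0.5 are not taken from the paper.

```python
import math


def highlight_posterior(loglik_highlight, loglik_other, prior_highlight=0.5):
    """Posterior probability of the highlight class via Bayes' theorem.

    loglik_highlight / loglik_other: log-likelihoods of the segment under
    the two class models (hypothetical inputs, e.g. from two trained HMMs).
    prior_highlight: assumed prior probability of the highlight class.
    """
    a = loglik_highlight + math.log(prior_highlight)
    b = loglik_other + math.log(1.0 - prior_highlight)
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(a, b)
    num = math.exp(a - m)
    return num / (num + math.exp(b - m))
```

A segment would then be labeled a highlight when the returned posterior exceeds a chosen threshold such as 0.5; the threshold is likewise an assumption of this sketch.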
ISBN (print): 9783642040306
Biological sequences such as DNA or proteins are always obtained through a sequencing process that may introduce some uncertainty. As a result, such sequences are usually written in a degenerate alphabet in which some symbols may correspond to several possible letters (e.g., the IUPAC DNA alphabet). When counting patterns in such degenerate sequences, the question that naturally arises is how to deal with degenerate positions. Since most (usually 99%) of the positions are not degenerate, it is considered harmless to discard the degenerate positions in order to get an observation, but the exact consequences of such a practice are unclear. In this paper, we introduce a rigorous method to take into account the uncertainty of sequencing for biological sequences (DNA, proteins). We first introduce a forward-backward approach to compute the marginal distribution of the constrained sequence and use it both to perform an Expectation-Maximization estimation of parameters and to derive a heterogeneous Markov distribution for the constrained sequence. This distribution is then used along with known DFA-based pattern approaches to obtain the exact distribution of the pattern count under the constraints. As an illustration, we consider an EST dataset from the EMBL database. Despite the fact that only 1% of the positions in this dataset are degenerate, we show that not taking these positions into account might lead to erroneous observations, further demonstrating the interest of our approach.
ISBN (print): 9783642015120
In this paper, we propose an improvement of the hidden semi-Markov model (HSMM) based speech synthesis system using duration-dependent state transition probabilities. In the traditional HMM algorithm, the probability of the duration of a state decreases exponentially with time, which does not provide an adequate representation of the temporal structure of speech. To overcome this limitation, the HSMM, which explicitly models the state duration distribution, was proposed. However, there is still an inconsistency: although the HSMM has explicit state duration probability distributions, the state transition probabilities are duration-invariant. In this paper, we introduce duration-dependent state transition probabilities, which are able to characterize the timescale distortion at a particular instant of an utterance more effectively, into the HSMM-based speech synthesis system. Correspondingly, we improve the forward-backward algorithm and re-derive the parameter re-estimation formulae. Experimental results show that the proposed method improves the naturalness of the synthesized speech.
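The exponential decay of state durations mentioned in this abstract follows from the standard HMM self-transition: a state with self-transition probability a is occupied for exactly d frames with probability a^(d-1) (1 - a), a geometric distribution. The short sketch below only illustrates that textbook property; the function name `hmm_duration_pmf` is hypothetical and nothing here reproduces the paper's duration-dependent model.

```python
def hmm_duration_pmf(self_transition, d):
    """Probability that a standard HMM state lasts exactly d frames.

    self_transition: probability a of staying in the state at each step.
    The state is kept for d - 1 steps and then left: a**(d - 1) * (1 - a).
    """
    if d < 1:
        raise ValueError("duration must be at least 1 frame")
    return self_transition ** (d - 1) * (1.0 - self_transition)


# The mode is always d = 1, whatever the value of a, which is the poor fit
# to speech segment durations that motivates explicit duration models.
durations = [hmm_duration_pmf(0.9, d) for d in range(1, 6)]
```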
ISBN (print): 9789544520106
We introduce a number of novel techniques for lexical substitution, including an application of the forward-backward algorithm, a grammatical-relation-based similarity measure, and a modified form of n-gram matching. We test these techniques on the Semeval-2007 lexical substitution data [McCarthy and Navigli, 2007] to demonstrate their competitive performance. We create a similar (small-scale) dataset for Czech, and our evaluation demonstrates the language independence of the techniques.
We present novel wavelet-based inpainting algorithms. Applying ideas from anisotropic regularization and diffusion, our models can better handle degraded pixels at edges. We interpret our algorithms within the framework of forward-backward splitting methods in convex analysis and prove that the conditions for ensuring their convergence are fulfilled. Numerical examples illustrate the good performance of our algorithms.
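Forward-backward splitting, the framework named in this abstract, alternates a gradient (forward) step on a smooth term with a proximal (backward) step on a nonsmooth term. The toy sketch below applies it to the problem of minimizing 0.5 * ||x - b||^2 + lam * ||x||_1, whose proximal step is soft-thresholding. It is a generic illustration under assumed names (`soft_threshold`, `forward_backward_splitting`) and an assumed objective, not the wavelet inpainting model of the paper.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def forward_backward_splitting(b, lam, step=0.5, iters=100):
    """Minimize 0.5 * ||x - b||^2 + lam * ||x||_1 (assumed toy objective).

    Each iteration takes a forward (gradient) step on the quadratic term,
    then a backward (proximal) step on the l1 term.
    """
    x = [0.0] * len(b)
    for _ in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]           # gradient of the smooth term
        x = [soft_threshold(xi - step * gi, step * lam)    # proximal step
             for xi, gi in zip(x, grad)]
    return x
```

For this toy objective the minimizer is known in closed form (soft-thresholding of b by lam), which makes it a convenient check that the iteration converges for a step size below 2.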
We consider analysis of complex stochastic models based upon partial information. MCMC and reversible jump MCMC are often the methods of choice for such problems, but in some situations they can be difficult to implement and can suffer from problems such as poor mixing and the difficulty of diagnosing convergence. Here we review three alternatives to MCMC methods: importance sampling, the forward-backward algorithm, and sequential Monte Carlo (SMC). We discuss how to design good proposal densities for importance sampling, show some of the range of models to which the forward-backward algorithm can be applied, and show how resampling ideas from SMC can be used to improve the efficiency of the other two methods. We demonstrate these methods on a range of examples, including estimating the transition density of a diffusion and of a discrete-state continuous-time Markov chain; inferring structure in population genetics; and segmenting genetic divergence data.
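For readers unfamiliar with the forward-backward algorithm reviewed in this abstract, here is a minimal sketch for a discrete hidden Markov model, using the usual per-step scaling to avoid underflow. The function name, the list-of-lists parameter layout, and the restriction to discrete emissions are assumptions made for illustration; the paper covers a much wider range of models.

```python
import math


def forward_backward(init, trans, emit, obs):
    """Posterior state marginals and log-likelihood for a discrete HMM.

    init[i]: P(x_1 = i); trans[i][j]: P(x_{t+1} = j | x_t = i);
    emit[i][k]: P(y_t = k | x_t = i); obs: list of observed symbols.
    Returns (gamma, loglik) with gamma[t][i] = P(x_t = i | all observations).
    """
    n, T = len(init), len(obs)

    # Forward pass: alpha[t] is the filtered distribution, scale[t] its normalizer.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [init[i] * emit[i][obs[0]] for i in range(n)]
        else:
            row = [emit[j][obs[t]] * sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
                   for j in range(n)]
        c = sum(row)
        alpha.append([v / c for v in row])
        scale.append(c)

    # Backward pass, rescaled with the same normalizers.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   / scale[t + 1] for i in range(n)]

    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    loglik = sum(math.log(c) for c in scale)
    return gamma, loglik
```

With this scaling each row of `gamma` already sums to one, and the log-likelihood is the sum of the logs of the forward normalizers.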
In this article we introduce two procedures for variable selection in cluster analysis and classification rules. One is mainly aimed at detecting the "noisy" noninformative variables, while the other also deals with multicollinearity and general dependence. Both methods are designed to be used after a "satisfactory" grouping procedure has been carried out. A forward-backward algorithm is proposed to make such procedures feasible in large datasets. A small simulation is performed and some real data examples are analyzed.