On the basis of the observation that conserved positions in transcription factor binding sites are often clustered together, we propose a simple extension to the model-based motif discovery methods. We assign position-specific prior distributions to the frequency parameters of the model, penalizing deviations from a specified conservation profile. Examples with both simulated and real data show that this extension helps discover motifs as the data become noisier or when there is a competing false motif.
In this paper, a noise adaptive speech recognition approach is proposed for recognizing speech corrupted by additive non-stationary background noise. The approach sequentially estimates noise parameters, through which a nonlinear parametric function adapts the mean vectors of the acoustic models. In the estimation process, the posterior probability of the state sequence given the observation sequence and the previously estimated noise parameter sequence is approximated by the normalized joint likelihood of the active partial paths and the observation sequence given the previously estimated noise parameter sequence. The Viterbi process provides the normalized joint likelihood. The acoustic models need not be trained from clean speech; they can also be trained from noisy speech. The approach can be applied to continuous speech recognition in the presence of non-stationary noise. Experiments on speech contaminated by simulated and real non-stationary noise show that, when the acoustic models are trained from clean speech, the noise adaptive speech recognition system improves word accuracy in slowly time-varying noise compared with the normal noise compensation system (which assumes the noise to be stationary). When the acoustic models are trained from noisy speech, the noise adaptive system improves performance in slowly time-varying noise over a system employing multi-conditional training. (C) 2003 Elsevier B.V. All rights reserved.
ISBN: 0819452025 (print)
The problem of image formation for X-ray transmission tomography is formulated as a statistical inverse problem. The maximum likelihood estimate of the attenuation function is sought. Using convex optimization methods, maximizing the log-likelihood functional is equivalent to a double minimization of I-divergence, one of the minimizations being over the attenuation function. Restricting the minimization over the attenuation function to a coarse grid component forms the basis for a multigrid algorithm that is guaranteed to monotonically decrease the I-divergence at every iteration on every scale.
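The link between the log-likelihood and the I-divergence can be checked numerically. The sketch below assumes independent Poisson counts, a common model for transmission data that the abstract does not spell out; the function names and sample numbers are illustrative only. Under that assumption, the negative log-likelihood of the means q given the counts y differs from I(y||q) only by a term that does not depend on q, so both criteria rank candidate means identically.

```python
import math

def i_divergence(p, q):
    """Csiszar I-divergence: sum of p*ln(p/q) - p + q, with 0*ln(0) taken as 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            total += pi * math.log(pi / qi)
        total += qi - pi
    return total

def poisson_neg_log_likelihood(y, q):
    """Negative log-likelihood of independent Poisson counts y with means q."""
    return sum(qi - yi * math.log(qi) + math.lgamma(yi + 1) for yi, qi in zip(y, q))

# Hypothetical measured counts and two candidate mean vectors.
y = [12, 7, 0, 3]
q_a = [10.0, 8.0, 0.5, 2.0]
q_b = [12.0, 7.0, 0.1, 3.0]

# The two criteria differ only by a constant in q, so their gaps agree.
nll_gap = poisson_neg_log_likelihood(y, q_a) - poisson_neg_log_likelihood(y, q_b)
idiv_gap = i_divergence(y, q_a) - i_divergence(y, q_b)
```

This shows only the equivalence of the two objectives; it does not implement the double minimization or the multigrid scheme described in the abstract.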
Background: The main goal in analyzing microarray data is to determine the genes that are differentially expressed across two types of tissue samples or samples obtained under two experimental conditions. The mixture model method (MMM hereafter) is a nonparametric statistical method often used for microarray processing applications, but it is known to over-fit the data if the number of replicates is small. In addition, the results of the MMM may not be repeatable when dealing with a small number of replicates. In this paper, we propose a new version of MMM to ensure the repeatability of the results across runs and to reduce the sensitivity of the results to the parameters. Results: The proposed technique is applied to two data sets: a Leukaemia data set and a data set that examines the effects of a low phosphate diet on regular and Hyp mice. In each study, the proposed algorithm successfully selects genes closely related to the disease state, as verified by biological information. Conclusion: The results indicate 100% repeatability in all runs and exhibit very little sensitivity to the choice of parameters. In addition, evaluation of the method on the Leukaemia data set shows a 12% improvement over the MMM in detecting the 50 biologically identified expressed genes of Thomas et al. The results attest to the successful performance of the proposed algorithm in quantitative pathogenesis of diseases and comparative evaluation of treatment methods.
Escherichia coli (E. coli) K12 was sequenced in 1997. The 4,639,221-base pair DNA sequence contains 4288 annotated protein-coding genes, 38 percent of which have no attributed function. One of the major problems in predicting prokaryotic promoters is locating the spacers between the -35 box and the -10 box and between the -10 box and the transcription start site. In this paper, we use the adopted expectation-maximization (EM) algorithm to accurately locate the promoter regions. A new purine-pyrimidine encoding method is proposed to reduce the dimensions of the training data. The heavy demand on systems for both computation and memory space can then be avoided through the choice of coding factor. The most representative features are used for training learning vector quantization networks. The simulation results reveal that the precision of promoter prediction using the proposed coding approach is approximately the same as the precision using the traditional encoding method.
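As an illustration of the dimensionality reduction, the sketch below collapses the four nucleotides into two classes: purines (A, G) and pyrimidines (C, T). The abstract does not give the exact encoding or define the coding factor, so the one-value-per-base mapping and the function name `encode_purine_pyrimidine` are assumptions. For comparison, a one-hot encoding would use four values per base.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def encode_purine_pyrimidine(sequence):
    """Map each base to 1 (purine) or 0 (pyrimidine); assumed scheme, not the paper's."""
    encoded = []
    for base in sequence.upper():
        if base in PURINES:
            encoded.append(1)
        elif base in PYRIMIDINES:
            encoded.append(0)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return encoded

# TATAAT is the consensus -10 box of E. coli sigma-70 promoters.
minus_10_box = encode_purine_pyrimidine("TATAAT")
```

The six-base box becomes six values instead of the twenty-four a one-hot encoding would need.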
In this paper, we provide a novel iterative identification algorithm for multi-rate sampled data systems. The procedure involves, as a first step, identifying a simple initial model from multi-rate data. Based on this model, the "missing" data points in the slow-sampled measurements are estimated following the expectation-maximization approach. Using the estimated missing data points and the original data set, a new model is obtained, and this procedure is repeated until the models converge. An attractive feature of the proposed method lies in its applicability to irregularly sampled data. An application of the proposed method to an industrial data set is also included.
This paper demonstrates how the EM algorithm can be used for learning and matching mixtures of point distribution models. We make two contributions. First, we show how shape-classes can be learned in an unsupervised manner. We present a fast procedure for training point distribution models using the EM algorithm. Rather than estimating the class means and covariance matrices needed to construct the PDM, the method iteratively refines the eigenvectors of the covariance matrix using a gradient ascent technique. Second, we show how recognition by alignment can be realised by fitting a mixture of linear shape deformations. We evaluate the method on the problem of learning the class-structure and recognising Arabic characters. (C) 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Polymerase chain reaction (PCR)-based tests for various microorganisms or target DNA sequences are generally acknowledged to be highly "sensitive," yet the concept of sensitivity is ill-defined in the literature on these tests. We propose that sensitivity should be expressed as a function of the number of target DNA molecules in the sample (or specificity, when the target number is 0). However, estimating this "sensitivity curve" is problematic, since it is difficult to construct samples with a fixed number of targets. Nonetheless, using serially diluted replicate aliquots of a known concentration of the target DNA sequence, we show that it is possible to disentangle random variations in the number of target DNA molecules from the underlying test sensitivity. We develop parametric, nonparametric, and semiparametric (spline-based) models for the sensitivity curve. The methods are compared on a new test for M. genitalium.
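The idea of separating random variation in target numbers from the test's sensitivity can be sketched numerically. The example below assumes that the number of targets in an aliquot is Poisson distributed around its expected value, and it uses a single-hit curve, s(n) = 1 - (1 - p)^n, as a stand-in parametric model; neither choice is confirmed by the abstract, and all names and numbers are hypothetical. The observed positive rate at each dilution is then the sensitivity curve averaged over the Poisson distribution.

```python
import math

def single_hit_sensitivity(n, p):
    """Assumed parametric curve: each target molecule is detected independently with prob. p."""
    return 1.0 - (1.0 - p) ** n

def positive_rate(mean_targets, sensitivity, max_n=200):
    """P(test positive) when the target count is Poisson with the given mean."""
    total = 0.0
    pmf = math.exp(-mean_targets)  # Poisson probability of n = 0
    for n in range(max_n + 1):
        total += pmf * sensitivity(n)
        pmf *= mean_targets / (n + 1)  # recurrence to the probability of n + 1
    return total

p = 0.3  # hypothetical per-molecule detection probability
# Tenfold serial dilution: expected targets per aliquot at each step.
dilution_means = [10.0, 1.0, 0.1]
rates = [positive_rate(m, lambda n: single_hit_sensitivity(n, p)) for m in dilution_means]
```

Under these assumptions the sum collapses to 1 - exp(-p * mean), which gives a quick check on the numerical result. This curve has s(0) = 0, so it models a test with perfect specificity.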
ISBN: 081944684X (print)
Change detection is a key topic in land use/land cover studies, and significant effort has gone into developing methods for it. In this article, a multivariate analysis method based on canonical transformation is introduced into change detection using multi-temporal remote sensing imagery. An automatic unsupervised discrimination technique based on the Bayes rule of minimum error is then employed to identify changed areas in the difference image. Experimental results of a case study using Landsat TM imagery are presented to demonstrate the effectiveness of our method.
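The Bayes minimum-error rule assigns each pixel of the difference image to the class with the larger posterior probability. The sketch below assumes Gaussian class-conditional densities with known parameters; the abstract does not say which densities are used or how they are estimated, so the parameter values and the names `gaussian_pdf` and `is_changed` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def is_changed(x, prior_changed, changed, unchanged):
    """Bayes minimum-error rule: pick the class with the larger prior * likelihood."""
    score_changed = prior_changed * gaussian_pdf(x, *changed)
    score_unchanged = (1.0 - prior_changed) * gaussian_pdf(x, *unchanged)
    return score_changed > score_unchanged

# Hypothetical (mean, std) of difference-image values for each class.
unchanged = (0.0, 1.0)
changed = (4.0, 1.5)
labels = [is_changed(x, 0.2, changed, unchanged) for x in (0.3, 2.0, 5.0)]
```

With a prior of 0.2 for change, the midpoint value 2.0 is still labelled unchanged, because the prior shifts the decision threshold toward the changed class.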