Cross-directional control and monitoring of paper machines require knowledge of the cross-directional profile of the quality variables. However, sensors used in paper machines follow a zig-zag trajectory, providing only data that combine cross-direction and machine-direction variations. In this paper, we propose a new model-based approach to estimate the complete cross-directional profile. The method is based on a modification of the expectation-maximization approach for a new model proposed in this paper. In a typical paper machine, the percentage of missing data is of the order of 99%. The proposed model reduces the missing data to about 50% and thus increases the reliability of the estimated model. Moreover, the proposed method ensures linear space invariance and symmetry of the cross-directional response of the model, with the added flexibility of using different models near the edges. The results are verified through simulations.
Estimating haplotype relative risks in a family-based study is complicated by phase ambiguity and the many parameters needed to quantify relative risks for all possible diplotypes. This problem becomes manageable if a particular haplotype has been implicated previously as relevant to risk. We fit log-linear models to estimate the risks associated with a candidate haplotype relative to the aggregate of other haplotypes. Our approach uses existing haplotype-reconstruction algorithms but requires assumptions about the distribution of haplotypes among triads in the source population. We consider three levels of stringency for those assumptions: Hardy-Weinberg Equilibrium (HWE), random mating, and no assumptions at all. We assessed our method's performance through simulations encompassing a range of risk haplotype frequencies, missing data patterns, and relative risks for either offspring or maternal genetic effects. The unconstrained model provides robustness to bias from population structure but requires excessively large sample sizes unless there are few haplotypes. Assuming HWE accommodates many more haplotypes but sacrifices robustness. The model assuming random mating is intermediate, both in the number of haplotypes it can handle and in robustness. To illustrate, we reanalyze data from a study of orofacial clefts to investigate a 9-SNP candidate haplotype of the IRF6 gene.
ISBN: 9781424433940 (print)
In this paper, we present a semi-supervised method for auto-annotating image collections and discovering unknown structures among them. The approach relies on the existence of only a small training database of annotated examples. First, a fully-supervised algorithm using annotated samples is presented. Next, we introduce a semi-supervised procedure which allows us to incorporate unannotated samples and to infer the existence of unknown structures, that is, the existence of new image classes which are not represented in the training database. Finally, we present experimental results from a database of satellite images and briefly mention the possibility of reusing the presented approach as a basis for more complex systems such as Content Based Image Retrieval (CBIR) systems.
ISBN: 9780769536538 (print)
Case-based reasoning (CBR) is a problem-solving technique commonly seen in artificial intelligence. The success of a CBR system depends heavily on the design of an effective case retrieval mechanism. The K-nearest neighbor (KNN) search method, which selects the K most similar prior cases for a new case, has been used extensively in the case retrieval phase of CBR. Although KNN is simple to implement, the choice of the K value is quite subjective and will influence the performance of a CBR system. To eliminate this disadvantage, this research proposes a significant nearest neighbor (SNN) search method. In SNN, the probability density function of the dissimilarity distribution is estimated by the expectation-maximization algorithm. Case selection is then conducted by determining whether the dissimilarity between a prior case and the new case is significantly low, based on the estimated dissimilarity distribution. The SNN search avoids human involvement in deciding the number of retrieved prior cases and makes the retrieval result objective and statistically meaningful. The performance of the proposed SNN search method is demonstrated through a set of experiments.
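The selection step described above can be read as a lower-tail test against the fitted dissimilarity distribution. The following is a minimal sketch of that reading, not the authors' implementation. It assumes the EM step has already produced a one-dimensional Gaussian mixture; the mixture parameters, the function names, and the 0.05 significance level are all hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def mixture_cdf(x, weights, means, stds):
    """P(D <= x) under a 1-D Gaussian mixture with the given parameters."""
    return sum(w * normal_cdf(x, m, s) for w, m, s in zip(weights, means, stds))


def significant_neighbors(dissimilarities, weights, means, stds, alpha=0.05):
    """Indices of prior cases whose dissimilarity falls in the lower alpha tail."""
    return [
        i
        for i, d in enumerate(dissimilarities)
        if mixture_cdf(d, weights, means, stds) < alpha
    ]


# Hypothetical mixture parameters standing in for the output of an EM fit.
weights, means, stds = [0.3, 0.7], [0.4, 0.7], [0.1, 0.1]

# Hypothetical dissimilarities between the new case and four prior cases.
dissimilarities = [0.05, 0.20, 0.45, 0.80]

selected = significant_neighbors(dissimilarities, weights, means, stds)
print(selected)
```

Under this reading, the number of retrieved cases is a by-product of the test rather than a fixed K chosen by the user.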
ISBN: 9781424439317 (print)
In this paper we propose a method for retrieving the Point Spread Function (PSF) of an imaging system given the observed image sections of a fluorescent microsphere. Theoretically calculated PSFs often lack the experimental or microscope-specific signatures, while empirically obtained data are oversized, too noisy, or both. The effect of noise and the influence of the microsphere size can be mitigated in the experimental data by using a Maximum Likelihood Expectation-Maximization (MLEM) algorithm. The true experimental parameters can then be estimated by fitting the result to a model based on scalar diffraction theory with lower-order Spherical Aberration (SA). The algorithm was tested on simulated data, and the results obtained validate the usefulness of the approach for retrieving the PSF from measured data.
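To make the MLEM step concrete, here is a minimal one-dimensional sketch of the multiplicative MLEM (Richardson-Lucy style) update, used to recover a PSF when the imaged object (the microsphere) is known. It is an illustration under simplifying assumptions, not the paper's algorithm: the data are 1-D, convolution is circular, the data are noise-free, and all names (`mlem_psf`, `bead`, and so on) and values are hypothetical.

```python
import math


def circular_convolve(a, b):
    """result[i] = sum_j a[j] * b[(i - j) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def circular_correlate(a, b):
    """result[i] = sum_j a[j] * b[(j - i) mod n]."""
    n = len(a)
    return [sum(a[j] * b[(j - i) % n] for j in range(n)) for i in range(n)]


def mlem_psf(observed, obj, iterations=200):
    """Estimate the PSF from observed = psf (*) obj using MLEM updates."""
    n = len(observed)
    norm = sum(obj)
    # Start from a flat, positive guess carrying the flux implied by the data.
    psf = [sum(observed) / norm / n] * n
    for _ in range(iterations):
        predicted = circular_convolve(psf, obj)
        ratio = [y / p if p > 0 else 0.0 for y, p in zip(observed, predicted)]
        correction = circular_correlate(ratio, obj)
        # Multiplicative update keeps the estimate non-negative.
        psf = [h * c / norm for h, c in zip(psf, correction)]
    return psf


# Hypothetical test data: a Gaussian-shaped PSF imaged through a 3-sample bead.
n = 16
true_psf = [math.exp(-0.5 * ((i - 8) / 1.5) ** 2) for i in range(n)]
scale = sum(true_psf)
true_psf = [v / scale for v in true_psf]
bead = [1.0, 1.0] + [0.0] * (n - 3) + [1.0]  # box centred on index 0
observed = circular_convolve(true_psf, bead)

estimate = mlem_psf(observed, bead)
```

The multiplicative form keeps the estimate non-negative and preserves total flux, two properties that make MLEM attractive for photon-count data.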
ISBN: 9781607504566 (print)
This article presents a method for processing mass spectrometry data. Mass spectra are modelled with Gaussian Mixture Models. Every peak of the spectrum is represented by a single Gaussian, whose parameters describe the location, height and width of the corresponding peak. The authors' own version of the Expectation Maximisation algorithm was used to perform all calculations. Errors were estimated with a virtual mass spectrometer, a tool originally designed to generate a set of spectra within defined parameters.
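The link between mixture parameters and peak shape can be illustrated with a short sketch. It assumes each peak is a normalized Gaussian scaled by a mixture weight, so that the peak height follows from the weight and width. The function names and peak values are hypothetical and are not taken from the article.

```python
import math


def gaussian(x, location, sigma):
    """Normalized Gaussian density centred at `location`."""
    z = (x - location) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def peak_height(weight, sigma):
    """Height of a weighted Gaussian component at its own centre."""
    return weight / (sigma * math.sqrt(2.0 * math.pi))


def fwhm(sigma):
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


def spectrum_intensity(mz, peaks):
    """Model intensity at m/z value `mz`; peaks is a list of (weight, location, sigma)."""
    return sum(w * gaussian(mz, loc, s) for w, loc, s in peaks)


# Two hypothetical, well-separated peaks: (weight, location in m/z, sigma).
peaks = [(0.7, 1000.0, 0.5), (0.3, 1010.0, 0.8)]

for weight, location, sigma in peaks:
    print(location, peak_height(weight, sigma), fwhm(sigma))
```

For well-separated peaks, the model intensity at a peak's location is essentially that component's own height, which is why one Gaussian per peak is a natural parameterization.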
ISBN: 9781424450534 (print)
Gaussian mixture modeling is a recent approach in texture analysis and is used to model image textures. Texture is modeled using a mixture of Gaussian distributions, which capture the local statistical properties of the texture. The mixture parameters are estimated using the expectation-maximization algorithm, which finds the maximum likelihood estimate of the parameters of an underlying distribution from a given data set when the data are incomplete. The paper presents a method of identifying changes as well as new patterns in the image using the Gaussian mixture model parameters. Model parameters of the original image texture are computed. Unexpected patterns in the image are discriminated by using a weighted normalized Euclidean distance measure derived from the model parameters.
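The abstract does not define the distance measure, so the following sketch shows one common form of a weighted normalized Euclidean distance as an assumption: each parameter difference is divided by a per-parameter scale and weighted before summing. The function name, the choice of scales and weights, and the example vectors are hypothetical.

```python
import math


def weighted_normalized_distance(reference, candidate, scales, weights):
    """sqrt(sum_i w_i * ((r_i - c_i) / s_i)^2) over matching parameter vectors."""
    if not (len(reference) == len(candidate) == len(scales) == len(weights)):
        raise ValueError("all parameter vectors must have the same length")
    total = 0.0
    for r, c, s, w in zip(reference, candidate, scales, weights):
        total += w * ((r - c) / s) ** 2
    return math.sqrt(total)


# Hypothetical model parameters of a reference texture and a candidate region.
reference = [1.0, 2.0]
candidate = [1.0, 4.0]
scales = [1.0, 2.0]    # e.g. a typical spread of each parameter (assumed)
weights = [0.5, 0.5]   # e.g. relative importance of each parameter (assumed)

distance = weighted_normalized_distance(reference, candidate, scales, weights)
print(distance)
```

A region would then be flagged as an unexpected pattern when its distance to the reference model exceeds a threshold chosen for the application.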
ISBN: 9781424427932 (print)
To investigate the performance of visual feature extraction methods for automatic image annotation, three methods are studied in this paper: the discrete cosine transform, the Gabor transform and the discrete wavelet transform. Each method is used separately to extract low-level visual feature vectors from images in a given database; these feature vectors are then mapped to high-level semantic words to annotate images with labels from a given semantic label set. Because depicting the visual features of an image by the feature distribution is more efficient than resorting to image segmentation to obtain semantic image blocks, this paper examines which of the three feature extraction methods performs best in image annotation based on the distribution of feature vectors from the image. The performance of the three feature extraction methods is analyzed in full, and it is found that the discrete cosine transform method is more suitable for the Gaussian mixture model in automatic image annotation.
Background: ChIP-chip data are routinely used to identify transcription factor binding targets. However, the presence of false positives and false negatives in ChIP-chip data complicates and hinders analyses, especially when the binding targets for a specific transcription factor are compared across conditions or species. Results: We propose an expectation-maximization-based approach to infer the underlying true counts of "positives" and "negatives" from the observed counts. Based on this approach, we study the effect of false positives and false negatives on inferences related to transcription regulation. Conclusion: Our results indicate that if there is a significant degree of association among the binding targets across conditions/species (log odds ratio > 4), moderate values of false positive and false negative rates (0.005 and 0.4, respectively) would not change our inference qualitatively (i.e., the presence or absence of conservation) based on the observed experimental data, despite a significant change in the observed counts. However, if the underlying association is marginal, with odds ratios close to 1, moderate to large values of false positive and false negative rates (0.01 and 0.2, respectively) could mask the underlying association.
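The idea of recovering true counts from observed counts can be sketched with a deliberately simplified EM. This version handles a single target list with known error rates, whereas the paper compares targets across conditions or species. The function name, the gene total of 6000 and the observed count of 150 are hypothetical; the error rates 0.005 and 0.4 are the moderate values quoted in the abstract.

```python
def em_true_positive_count(observed_pos, total, fp_rate, fn_rate, iterations=500):
    """Estimate the number of true targets from an observed positive count.

    Latent variable: whether each gene is a true target.
    Known (assumed) quantities: false positive and false negative rates.
    """
    observed_neg = total - observed_pos
    prior = 0.5  # initial guess for the fraction of true targets
    for _ in range(iterations):
        # E-step: probability of being a true target given the observed call.
        post_if_pos = prior * (1.0 - fn_rate) / (
            prior * (1.0 - fn_rate) + (1.0 - prior) * fp_rate
        )
        post_if_neg = prior * fn_rate / (
            prior * fn_rate + (1.0 - prior) * (1.0 - fp_rate)
        )
        # M-step: update the fraction of true targets.
        prior = (observed_pos * post_if_pos + observed_neg * post_if_neg) / total
    return prior * total


# Hypothetical counts: 150 observed binding targets among 6000 genes.
estimated_true = em_true_positive_count(150, 6000, fp_rate=0.005, fn_rate=0.4)
print(round(estimated_true, 2))
```

In this single-list setting the EM estimate coincides with solving observed = true * (1 - fn) + (total - true) * fp for the true count, which shows how a high false negative rate can make the true count noticeably larger than the observed one.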
The Genetic Analysis Workshop 16 rheumatoid arthritis data include a set of 868 cases and 1194 controls genotyped at 545,080 single-nucleotide polymorphisms (SNPs) from the Illumina 550k chip. We focus on investigating chromosomes 6 and 18, which have 35,574 and 16,450 SNPs, respectively. Association studies, including single-SNP and haplotype-based analyses, were applied to the data on those two chromosomes. Specifically, we applied a regularized generalized linear model (rGLM) approach for detecting disease-haplotype association using unphased SNP data. A total of 444 and 43 four-SNP tests were found to be significant at the Bonferroni-corrected 5% significance level on chromosomes 6 and 18, respectively.
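The Bonferroni correction mentioned above divides the significance level by the number of tests. The abstract does not state how many four-SNP tests were run, so the sketch below assumes, purely for illustration, one test per sliding window of four consecutive SNPs. The function names are hypothetical.

```python
def sliding_window_count(n_snps, window):
    """Number of windows of `window` consecutive SNPs (an assumed test layout)."""
    return max(n_snps - window + 1, 0)


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise level at alpha."""
    return alpha / n_tests


# SNP counts per chromosome are taken from the abstract; the window layout is assumed.
for chromosome, n_snps in [(6, 35574), (18, 16450)]:
    n_tests = sliding_window_count(n_snps, 4)
    print(chromosome, n_tests, bonferroni_threshold(0.05, n_tests))
```

Under that assumption, each chromosome's per-test threshold is on the order of 10^-6, which gives a sense of how strong an association must be to survive the correction.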