Search results - Inner Mongolia University Library

Accelerating the convergence of EM-based training algorithms for RBF networks

6th International Work-Conference on Artificial and Natural Neural Networks, IWANN 2001

Authors: Lázaro, Marcelino; Santamaría, Ignacio; Pantaleón, Carlos. Affiliation: Dpto. Ing. Comunicaciones, ETSII y Telecom, Universidad de Cantabria, Avda. Los Castros, 39005 Santander, Spain

ISBN: (print) 3540422358

In this paper, we propose a new expectation-maximization (EM) algorithm which speeds up the training of feedforward networks with local activation functions such as the Radial Basis Function (RBF) network. The core of the conditional EM algorithm for supervised learning of feedforward networks consists of decomposing the observations into their individual output units and then estimating the parameters of each unit separately. In previously proposed approaches, at each *** the residual is decomposed equally among the units or proportionally to the weights of the output layer. However, this approach tends to slow down the training of networks with local activation units. To overcome this drawback, in this paper we use a new E-step which applies a soft decomposition of the residual among the units. In particular, the residual is decomposed according to the probability of each RBF unit given each input-output pattern. It is shown that this variant not only speeds up the training in comparison with other EM-type algorithms, but also provides better results than a global gradient-descent technique, since it has the capability of avoiding some unwanted minima of the cost function. © Springer-Verlag Berlin Heidelberg 2001.
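
The three residual-splitting schemes named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 1-D Gaussian units, the shared `width`, and the function names are hypothetical, and the "soft" branch uses normalized activations as a stand-in for the unit posterior (the abstract conditions that posterior on the output as well, and does not give its formula).

```python
import math

def rbf_activations(x, centers, width):
    """Gaussian activations of 1-D RBF units at input x (assumed unit shape)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def unit_targets(x, y, centers, weights, width, scheme="soft"):
    """Split the residual y - f(x) among the units and return per-unit targets."""
    phi = rbf_activations(x, centers, width)
    contrib = [w * p for w, p in zip(weights, phi)]  # current output of each unit
    residual = y - sum(contrib)
    n = len(centers)
    if scheme == "equal":        # same share for every unit
        raw = [1.0] * n
    elif scheme == "weights":    # share proportional to |output weight| (assumed form)
        raw = [abs(w) for w in weights]
    else:                        # "soft": assumed stand-in for the unit posterior
        raw = phi
    total = sum(raw)
    if total == 0.0:             # fall back to equal shares if nothing is active
        raw, total = [1.0] * n, float(n)
    shares = [r / total for r in raw]
    # Each unit is then fitted to its own target; the targets always sum to y.
    return [c + s * residual for c, s in zip(contrib, shares)]
```

With local units, the soft split hands almost all of the residual to the units that are active near the input, whereas the equal split also pushes distant units toward a pattern they cannot represent.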

Keywords: expectation maximization algorithm

Unsupervised ensemble learning for genome sequencing

PATTERN RECOGNITION, 2022, vol. 129

Authors: Pages-Zamora, Alba; Ochoa, Idoia; Cavero, Gonzalo Ruiz; Villalvilla-Ornat, Pol. Affiliations: Univ Politecn Catalunya BarcelonaTech UPC, SPCOM Grp, C-Jordi Girona 31, Barcelona 08034, Spain; Univ Navarra, Tecnun, Manuel Lardizabal 13, San Sebastian 20018, Spain; Swiss Fed Inst Technol, Erdbeobachtung u Fernerkundung, HCP G 33-1, Leopold Ruzicka Weg 4, CH-8093 Zurich, Switzerland

Unsupervised ensemble learning refers to methods devised for a particular task that combine data provided by decision learners taking into account their reliability, which is usually inferred from the data. Here, the variant calling step of the next generation sequencing technologies is formulated as an unsupervised ensemble classification problem. A variant calling algorithm based on the expectation-maximization algorithm is further proposed that estimates the maximum-a-posteriori decision among a number of classes larger than the number of different labels provided by the learners. Experimental results with real human DNA sequencing data show that the proposed algorithm is competitive compared to state-of-the-art variant callers such as GATK, HTSLIB, and Platypus. (c) 2022 The Author(s). Published by Elsevier *** is an open access article under the CC BY-NC-ND license ( http://***/licenses/by-nc-nd/4.0/ )

Keywords: expectation maximization algorithm; Variant calling; Genome sequencing; Unsupervised multi-class ensemble classifier; GATK

A novel Mixture Model Method for identification of differentially expressed genes from DNA microarray data

BMC BIOINFORMATICS, 2004, vol. 5, no. 1, pp. 201-201

Authors: Najarian, K; Zaheri, M; Rad, AA; Najarian, S; Dargahi, J. Affiliations: Univ N Carolina, Dept Comp Sci, Charlotte, NC 28223, USA; Amirkabir Univ Technol, Comp Engn & IT Dept, Tehran, Iran; Concordia Univ, CONCAVE Res Ctr, Dept Mech & Ind Engn, Quebec City, PQ, Canada

Background: The main goal in analyzing microarray data is to determine the genes that are differentially expressed across two types of tissue samples or samples obtained under two experimental conditions. The mixture model method (MMM hereafter) is a nonparametric statistical method often used for microarray processing applications, but it is known to over-fit the data if the number of replicates is small. In addition, the results of the MMM may not be repeatable when dealing with a small number of replicates. In this paper, we propose a new version of MMM to ensure the repeatability of the results in different runs and to reduce the sensitivity of the results to the parameters. Results: The proposed technique is applied to two different data sets: a Leukaemia data set and a data set that examines the effects of a low phosphate diet on regular and Hyp mice. In each study, the proposed algorithm successfully selects genes closely related to the disease state that are verified by biological information. Conclusion: The results indicate 100% repeatability in all runs and exhibit very little sensitivity to the choice of parameters. In addition, the evaluation of the applied method on the Leukaemia data set shows a 12% improvement compared to the MMM in detecting the 50 biologically identified expressed genes reported by Thomas et al. The results attest to the successful performance of the proposed algorithm in quantitative pathogenesis of diseases and comparative evaluation of treatment methods.

Keywords: Bayesian Information Criterion; expectation maximization; Acute Myeloid Leukaemia; Acute Lymphoblastic Leukaemia; expectation maximization algorithm

Adaptive quantile low-rank matrix factorization

PATTERN RECOGNITION, 2020, vol. 103, pp. 107310-107310

Authors: Xu, Shuang; Zhang, Chun-Xia; Zhang, Jiangshe. Affiliation: Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Shaanxi, Peoples R China

Low-rank matrix factorization (LRMF) has become popular owing to its successful applications in both computer vision and data mining. By assuming noise to come from a Gaussian, Laplace or mixture of Gaussian distributions, significant efforts have been made on optimizing the (weighted) L1- or L2-norm loss between an observed matrix and its bilinear factorization. However, the type of noise distribution is generally unknown in real applications and inappropriate assumptions will inevitably deteriorate the behavior of LRMF. In addition, real data are often corrupted by skew rather than symmetric noise. To tackle this problem, this paper presents a novel LRMF model called AQ-LRMF by modeling noise with a mixture of asymmetric Laplace distributions. An efficient algorithm based on the expectation-maximization (EM) algorithm is also offered to estimate the parameters involved in AQ-LRMF. The AQ-LRMF model possesses the advantage that it can approximate noise well no matter whether the real noise is symmetric or skew. The core idea of AQ-LRMF lies in solving a weighted L1 problem with weights being learned from data. The experiments conducted on synthetic and real data sets show that AQ-LRMF outperforms several state-of-the-art techniques. Furthermore, AQ-LRMF is also superior to the other algorithms in terms of capturing local structural information contained in real images. (C) 2020 Elsevier Ltd. All rights reserved.
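
The link between asymmetric Laplace noise and a weighted L1 problem can be illustrated with the check (pinball) loss. The sketch below assumes the common quantile-regression parameterization, in which the negative log-likelihood of an asymmetric Laplace residual is proportional to the pinball loss; whether AQ-LRMF uses exactly this parameterization is not stated in the abstract, and the function names and the fixed skew parameter `tau` are hypothetical (the abstract says the weights are learned from data).

```python
def pinball_loss(e, tau):
    """Check loss: tau * e for e >= 0 and (tau - 1) * e for e < 0."""
    return e * (tau - (1.0 if e < 0 else 0.0))

def l1_weight(e, tau):
    """Weight that turns |e| into the pinball loss for a given skew tau."""
    return (1.0 - tau) if e < 0 else tau

def weighted_l1(residuals, tau):
    """Sum of pinball losses written as an explicitly weighted L1 norm."""
    return sum(l1_weight(e, tau) * abs(e) for e in residuals)
```

For `tau = 0.5` both signs receive the same weight and the loss reduces to a scaled ordinary L1 norm, which matches the symmetric Laplace case.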

Keywords: Low-rank matrix factorization; Mixture of asymmetric Laplace distributions; expectation maximization algorithm; Skew noise

Effect of false positive and false negative rates on inference of binding target conservation across different conditions and species from ChIP-chip data

BMC BIOINFORMATICS, 2009, vol. 10, no. 1, pp. 23-23

Authors: Datta, Debayan; Zhao, Hongyu. Affiliations: Yale Univ, Dept Biomed Engn, New Haven, CT 06520, USA; Yale Univ, Dept Epidemiol & Publ Hlth, New Haven, CT 06520, USA; Yale Univ, Dept Genet, New Haven, CT 06520, USA

Background: ChIP-chip data are routinely used to identify transcription factor binding targets. However, the presence of false positives and false negatives in ChIP-chip data complicates and hinders analyses, especially when the binding targets for a specific transcription factor are compared across conditions or species. Results: We propose an expectation maximization based approach to infer the underlying true counts of "positives" and "negatives" from the observed counts. Based on this approach, we study the effect of false positives and false negatives on inferences related to transcription regulation. Conclusion: Our results indicate that if there is a significant degree of association among the binding targets across conditions/species (log odds ratio > 4), moderate values of false positive and false negative rates (0.005 and 0.4 respectively) would not change our inference qualitatively (i.e. the presence or absence of conservation) based on the observed experimental data despite a significant change in the observed counts. However, if the underlying association is marginal, with odds ratios close to 1, moderate to large values of false positive and false negative rates (0.01 and 0.2 respectively) could mask the underlying association.
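
The idea of recovering true counts from error-prone observed counts can be sketched with a small EM loop. This is a simplified single-condition illustration, not the paper's method, which works with counts across conditions or species: it assumes known false positive and false negative rates, one list of genes, and a single unknown, the true fraction of binding targets. All names are hypothetical.

```python
def em_true_fraction(n_total, n_observed_pos, fpr, fnr, iters=200, pi=0.5):
    """Estimate the true fraction of positives from observed positive calls."""
    n_observed_neg = n_total - n_observed_pos
    for _ in range(iters):
        # E-step: probability that an observed positive / negative is truly positive
        p_pos = pi * (1 - fnr) / (pi * (1 - fnr) + (1 - pi) * fpr)
        p_neg = pi * fnr / (pi * fnr + (1 - pi) * (1 - fpr))
        # M-step: expected number of true positives divided by the total
        pi = (n_observed_pos * p_pos + n_observed_neg * p_neg) / n_total
    return pi
```

In this one-parameter setting the EM fixed point coincides with the closed-form estimate (observed fraction minus `fpr`) divided by (1 - `fpr` - `fnr`) whenever that value lies between 0 and 1, which gives a convenient check on the iteration.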

Keywords: False Positive Rate; Contingency Table; False Negative Rate; expectation maximization algorithm; True Proportion

caBIG™ VISDA: Modeling, visualization, and discovery for cluster analysis of genomic data

BMC BIOINFORMATICS, 2008, vol. 9, no. 1, pp. 1-18

Authors: Zhu, Yitan; Li, Huai; Miller, David J.; Wang, Zuyi; Xuan, Jianhua; Clarke, Robert; Hoffman, Eric P.; Wang, Yue. Affiliations: Virginia Polytech Inst & State Univ, Dept Elect & Comp Engn, Arlington, VA 22203, USA; NIA, Bioinformat Unit, RRB, NIH, Baltimore, MD 21224, USA; Penn State Univ, Dept Elect Engn, University Pk, PA 16802, USA; Childrens Natl Med Ctr, Med Genet Res Ctr, Washington, DC 20010, USA; Georgetown Univ, Dept Oncol Physiol & Biophys, Washington, DC 20007, USA; Georgetown Univ, Lombardi Comprehens Canc Ctr, Washington, DC 20007, USA

Background: The main limitations of most existing clustering methods used in genomic data analysis include heuristic or random algorithm initialization, the potential of finding poor local optima, the lack of cluster number detection, an inability to incorporate prior/expert knowledge, black-box and non-adaptive designs, in addition to the curse of dimensionality and the discernment of uninformative, uninteresting cluster structure associated with confounding variables. Results: In an effort to partially address these limitations, we develop the VIsual Statistical Data Analyzer (VISDA) for cluster modeling, visualization, and discovery in genomic data. VISDA performs progressive, coarse-to-fine (divisive) hierarchical clustering and visualization, supported by hierarchical mixture modeling, supervised/unsupervised informative gene selection, supervised/unsupervised data visualization, and user/prior knowledge guidance, to discover hidden clusters within complex, high-dimensional genomic data. The hierarchical visualization and clustering scheme of VISDA uses multiple local visualization subspaces (one at each node of the hierarchy) and consequent subspace data modeling to reveal both global and local cluster structures in a "divide and conquer" scenario. Multiple projection methods, each sensitive to a distinct type of clustering tendency, are used for data visualization, which increases the likelihood that cluster structures of interest are revealed. Initialization of the full dimensional model is based on first learning models with user/prior knowledge guidance on data projected into the low-dimensional visualization spaces. Model order selection for the high dimensional data is accomplished by Bayesian theoretic criteria and user justification applied via the hierarchy of low-dimensional visualization subspaces. Based on its complementary building blocks and flexible functionality, VISDA is generally applicable for gene clustering, sample clustering, and phenoty

Keywords: expectation maximization algorithm; Minimum Description Length; Locality Preserve Projection; Emerin; Discriminative Gene

A common random effect induced bivariate gamma degradation process with application to remaining useful life prediction

RELIABILITY ENGINEERING & SYSTEM SAFETY, 2022, vol. 219, no. 0, pp. 108200-108200

Authors: Song, Kai; Cui, Lirong. Affiliations: Beijing Inst Technol, Sch Management & Econ, Beijing, Peoples R China; Qingdao Univ, Coll Qual & Standardizat, Qingdao, Peoples R China

Due to the complex structures and the multi-functionality of modern products, there are usually two or more performance characteristics which can reflect a product's degradation states. The degradation processes corresponding to these performance characteristics are dependent in general, which brings challenges to the degradation data analysis. In this paper, a gamma process based degradation model is developed for bivariate dependent degradation data, where the dependency between the two degradation processes is naturally captured by a common random effect. The expectation maximization algorithm is employed to estimate the model parameters. Then, a real-time prediction method for a product's remaining useful life is proposed using the Bayesian method. Finally, both a simulation study and a case study are provided for illustration, and their results demonstrate that the proposed model and the corresponding inference methods work well.

Keywords: Bivariate degradation data; expectation maximization algorithm; Gamma process; Maximum likelihood estimation; Remaining useful life prediction

Point set registration via rigid transformation consensus

COMPUTERS & ELECTRICAL ENGINEERING, 2022, vol. 101

Authors: Li, Zhaolong; Wang, Cheng; Ma, Jieying; Li, Zhongyu; Zhu, Jihua. Affiliations: Xi An Jiao Tong Univ, Sch Software Engn, Xi'an 710049, Peoples R China; State Key Lab Rail Transit Engn Informatizat FSDI, Xi'an 710043, Peoples R China

Point set registration is a fundamental problem in many domains. This paper proposes a novel pair-wise registration algorithm based on rigid transformation consensus. It starts by building a point correspondence set, which contains both inliers and outliers. Because of non-overlapping regions, it associates each point correspondence with a latent variable and formulates pair-wise registration as a maximum likelihood estimation problem, which is optimized by the expectation-maximization algorithm. Since all inliers follow the consensus of one similar rigid transformation, each correspondence is assigned a posterior probability that indicates whether it is an inlier or an outlier. To obtain the desired result, the algorithm alternates between establishing point correspondences and maximum likelihood estimation. Given an initial rigid transformation, the proposed algorithm is able to obtain the desired registration result for the pair-wise registration. Experiments on publicly available data sets illustrate its superior accuracy and efficiency over previous algorithms.
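
The posterior that labels each correspondence as inlier or outlier can be sketched as an E-step over residual distances. This is only an illustration under stated assumptions, not the paper's model: inlier residuals are taken to follow an isotropic Gaussian around the current rigid transformation, outliers a constant (uniform) density, and the names `inlier_prior` and `outlier_density` are hypothetical.

```python
import math

def inlier_posteriors(residuals, sigma, inlier_prior=0.5,
                      outlier_density=1e-3, dim=3):
    """Posterior probability that each correspondence is an inlier.

    residuals: distances between transformed source points and their matches.
    """
    # Normalizing constant of an isotropic Gaussian in `dim` dimensions
    norm = (2.0 * math.pi * sigma ** 2) ** (-dim / 2.0)
    uniform_part = (1.0 - inlier_prior) * outlier_density
    posteriors = []
    for r in residuals:
        gauss_part = inlier_prior * norm * math.exp(-r * r / (2.0 * sigma ** 2))
        posteriors.append(gauss_part / (gauss_part + uniform_part))
    return posteriors
```

In an M-step, these posteriors would typically serve as weights when re-estimating the rigid transformation, so that correspondences with large residuals contribute little.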

Keywords: Point set registration; expectation maximization algorithm; Gaussian distribution; Point correspondence; Inlier; Outlier

Centroid based clustering of high throughput sequencing reads based on n-mer counts

BMC BIOINFORMATICS, 2013, vol. 14, no. 1, pp. 1-21

Authors: Solovyov, Alexander; Lipkin, W. Ian. Affiliation: Columbia Univ, Ctr Infect & Immun, New York, NY 10032, USA

Background: Many problems in computational biology require alignment-free sequence comparisons. One of the common tasks involving sequence comparison is sequence clustering. Here we apply methods of alignment-free comparison (in particular, comparison using sequence composition) to the challenge of sequence clustering. Results: We study several centroid based algorithms for clustering sequences based on word counts. Study of their performance shows that using the k-means algorithm with or without data whitening is computationally efficient. A higher clustering accuracy can be achieved using the soft expectation maximization method, whereby each sequence is attributed to each cluster with a specific probability. We implement an open source tool for alignment-free clustering. It is publicly available from GitHub: https://***/luscinius/afcluster. Conclusions: We show the utility of alignment-free sequence clustering for high throughput sequencing analysis despite its limitations. In particular, it allows one to perform assembly with reduced resources and a minimal loss of quality. The major factor affecting performance of alignment-free read clustering is the length of the read.
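
The contrast between hard (k-means style) and soft assignment of reads can be sketched on n-mer frequency vectors. This is a minimal illustration and not the afcluster implementation: the squared Euclidean distance, the softmax form of the soft assignment, the `scale` parameter, and all function names are assumptions.

```python
import math
from collections import Counter
from itertools import product

def nmer_vector(seq, n=2, alphabet="ACGT"):
    """Normalized n-mer (word) frequencies of a sequence, in a fixed word order."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = max(sum(counts.values()), 1)
    return [counts["".join(w)] / total for w in product(alphabet, repeat=n)]

def squared_distances(vec, centroids):
    return [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]

def hard_assign(vec, centroids):
    """k-means style: index of the nearest centroid."""
    d2 = squared_distances(vec, centroids)
    return d2.index(min(d2))

def soft_assign(vec, centroids, scale=0.5):
    """Soft EM style: probability of each cluster for this read (assumed form)."""
    d2 = squared_distances(vec, centroids)
    nearest = min(d2)  # subtract the minimum for numerical stability
    w = [math.exp(-(d - nearest) / (2.0 * scale ** 2)) for d in d2]
    z = sum(w)
    return [x / z for x in w]
```

A hard assignment discards how close a read is to the second-best cluster, while the soft version keeps that information as a probability, which is the property the abstract credits for higher accuracy.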

Keywords: expectation maximization; Recall Rate; expectation maximization algorithm; Word Count; Consensus Cluster

Robust unsupervised detection of action potentials with probabilistic models

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2008, vol. 55, no. 4, pp. 1344-1354

Authors: Benitez, Raul; Nenadic, Zoran. Affiliations: Univ Calif Irvine, Dept Biomed Engn, Irvine, CA 92697, USA; Univ Calif Irvine, Dept Elect Engn, Irvine, CA 92697, USA

We develop a robust and fully unsupervised algorithm for the detection of action potentials from extracellularly recorded data. Using the continuous wavelet transform allied to probabilistic mixture models and Bayesian probability theory, the detection of action potentials is posed as a model selection problem. Our technique provides a robust performance over a wide range of simulated conditions, and compares favorably to selected supervised and unsupervised detection techniques.

Keywords: action potentials; Bayesian probability theory; continuous wavelet transform; expectation maximization algorithm; finite mixture models; maximum likelihood principle; receiver operating characteristic; unsupervised detection
