To analyze univariate truncated normal data, in this paper we stochastically represent the normal random variable as a mixture of a truncated normal random variable and its complementary random variable. This stochastic representation is new and appears here for the first time in the literature. From it, we derive important distributional properties of the truncated normal distribution and develop two new expectation-maximization (EM) algorithms to calculate the maximum likelihood estimates of the parameters of interest for Type I data (without and with covariates) and for Type II/III data. Bootstrap confidence intervals for the parameters are provided for small sample sizes. To evaluate the performance of the proposed methods, our simulation studies first compare estimation results obtained with and without the unobserved data counts, and then investigate how the number of unobserved observations affects the estimates. The plasma ferritin concentration data collected by the Australian Institute of Sport and the blood fat content data are used to illustrate the proposed methods and to compare the truncated normal distribution with the half normal, folded normal, and folded normal slash distributions via the Akaike information criterion and the Bayesian information criterion.
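As a rough numerical illustration of the stochastic representation described above, the Python sketch below mixes a truncated normal on [a, b] (with weight p = P(a <= X <= b)) with its complementary truncation and checks that the mixture is distributed as the original normal. The interval endpoints, parameter values, and sample size are arbitrary choices for the demonstration, not values from the paper.

```python
# Numerical check: a normal random variable equals, in distribution, a mixture
# of its truncation to [a, b] (weight p) and the complementary truncation to
# R \ [a, b] (weight 1 - p). All constants below are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, a, b, n = 1.0, 2.0, -0.5, 2.5, 200_000

alpha, beta = (a - mu) / sigma, (b - mu) / sigma
p = stats.norm.cdf(beta) - stats.norm.cdf(alpha)   # P(a <= X <= b)

# Component 1: truncated normal on [a, b].
inside = stats.truncnorm.rvs(alpha, beta, loc=mu, scale=sigma,
                             size=n, random_state=rng)

# Component 2: complementary truncation, sampled by inverse CDF on the
# probability mass lying outside [a, b].
u = rng.uniform(size=n)
mass_left = stats.norm.cdf(alpha)                  # P(X < a)
v = u * (1 - p)
q = np.where(v < mass_left, v, v + p)              # skip the mass inside [a, b]
outside = mu + sigma * stats.norm.ppf(q)

# Mix with weight p and compare against N(mu, sigma^2).
pick = rng.uniform(size=n) < p
mixture = np.where(pick, inside, outside)
print(stats.kstest(mixture, "norm", args=(mu, sigma)))
```

The Kolmogorov-Smirnov p-value should be large, consistent with the mixture recovering N(mu, sigma^2) exactly.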
Restricted versions of the cointegrated vector autoregression are usually estimated using switching algorithms. These algorithms alternate between two sets of variables but can be slow to converge. Acceleration methods are proposed that combine simplicity and effectiveness. These methods also outperform existing proposals in some applications of the expectation-maximization method and parallel factor analysis.
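To illustrate how such acceleration works, the sketch below applies one widely used scheme, the Varadhan-Roland squared extrapolation (SQUAREM), to a toy EM fixed-point map for the mixing weight of a two-component Gaussian mixture. This is not the paper's proposal for the cointegrated VAR; the fixed-point map, step-length rule, and stabilizing safeguards shown are standard choices.

```python
# SQUAREM-style acceleration of a scalar EM fixed-point iteration.
import numpy as np

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(3, 1, 700)])

def em_step(pi):
    """One EM update for the weight of the N(3,1) component in a known mixture."""
    f0 = np.exp(-0.5 * x**2)
    f1 = np.exp(-0.5 * (x - 3) ** 2)
    resp = pi * f1 / ((1 - pi) * f0 + pi * f1)   # posterior P(component 1 | x)
    return resp.mean()

def run(step, pi=0.5, tol=1e-10, max_iter=10_000):
    for it in range(1, max_iter + 1):
        new = step(pi)
        if abs(new - pi) < tol:
            return new, it
        pi = new
    return pi, max_iter

def squarem_step(pi):
    p1 = em_step(pi)
    p2 = em_step(p1)
    r, v = p1 - pi, p2 - 2 * p1 + pi
    if v == 0:
        return p2
    alpha = min(-abs(r) / abs(v), -1.0)          # clipped steplength for stability
    cand = pi - 2 * alpha * r + alpha**2 * v
    return em_step(np.clip(cand, 1e-6, 1 - 1e-6))  # stabilizing EM step

print("plain EM:   ", run(em_step))
print("accelerated:", run(squarem_step))
```

Each accelerated iteration costs three EM evaluations, so the iteration counts printed should be compared with that in mind.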
Maintaining the desired interface level between the top froth layer and the liquid layer plays an important role in achieving high recovery of products in oil sands and related process industries. Because varying throughputs and downstream disturbances tend to change the interface level over time, it is an important indicator of process behavior. In this paper, we propose an approach based on a Gaussian mixture model and Markov random field (MRF) unsupervised image segmentation to achieve real-time, accurate measurement of the interface. The image processing problem is solved as a maximum a posteriori (MAP) estimation problem employing the MRF framework, and the parameters are estimated using the EM algorithm. The proposed approach is validated using images captured from laboratory-scale equipment designed to simulate the industrial PSV interface.
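A minimal sketch of the pixel-labelling idea follows: intensities of a synthetic two-layer image are clustered with a two-component Gaussian mixture fitted by EM, and a crude neighbourhood majority vote stands in for the full MRF/MAP smoothing step. The synthetic image, the 3x3 vote, and the interface rule are illustrative stand-ins, not the paper's MRF energy or PSV data.

```python
# GMM-based segmentation of a synthetic froth/liquid image, followed by a
# cheap neighbour-vote smoothing as a surrogate for MRF/MAP label smoothing.
import numpy as np
from scipy import ndimage
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
h, w = 80, 120
# Bright top layer above row 35, dark layer below, plus Gaussian noise.
img = np.where(np.arange(h)[:, None] < 35, 0.8, 0.2) + rng.normal(0, 0.15, (h, w))

gmm = GaussianMixture(n_components=2, random_state=0).fit(img.reshape(-1, 1))
comp_bright = np.argmax(gmm.means_.ravel())        # component with higher mean
labels = (gmm.predict(img.reshape(-1, 1)) == comp_bright).astype(int).reshape(h, w)

# One smoothing pass: relabel each pixel by the majority of its 3x3 window.
votes = ndimage.uniform_filter(labels.astype(float), size=3)
smooth = (votes > 0.5).astype(int)

# Interface level: first row where the bright label stops dominating.
interface_row = np.argmax(smooth.mean(axis=1) < 0.5)
print("estimated interface row:", interface_row)
```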
A spatial lattice model for binary data is constructed from two spatial scales linked through conditional probabilities. A coarse grid of lattice locations is specified, and all remaining locations (which we call the background) capture fine-scale spatial dependence. Binary data on the coarse grid are modelled with an autologistic distribution, conditional on the binary process on the background. The background behaviour is captured through a hidden Gaussian process after a logit transformation on its Bernoulli success probabilities. The likelihood is then the product of the (conditional) autologistic probability distribution and the hidden Gaussian-Bernoulli process. The parameters of the new model come from both spatial scales. A series of simulations illustrates the spatial-dependence properties of the model and likelihood-based methods are used to estimate its parameters. Presence-absence data of corn borers in the roots of corn plants are used to illustrate how the model is fitted.
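For intuition about the coarse-scale component, here is a hedged sketch that simulates a single-scale autologistic model on a square lattice by Gibbs sampling; the paper's two-scale construction with a hidden Gaussian background is richer than this illustration. The grid size, the parameter values alpha and beta, and the free-boundary convention are arbitrary.

```python
# Gibbs sampling from a plain autologistic model on an n x n binary lattice:
# P(y[i,j] = 1 | neighbours) = logistic(alpha + beta * sum of 4 neighbours).
import numpy as np

rng = np.random.default_rng(3)
n, alpha, beta, sweeps = 40, -0.3, 0.8, 100
y = rng.integers(0, 2, (n, n))

def neighbour_sum(y, i, j):
    # Four-nearest-neighbour sum with free boundaries.
    s = 0
    if i > 0:     s += y[i - 1, j]
    if i < n - 1: s += y[i + 1, j]
    if j > 0:     s += y[i, j - 1]
    if j < n - 1: s += y[i, j + 1]
    return s

for _ in range(sweeps):
    for i in range(n):
        for j in range(n):
            eta = alpha + beta * neighbour_sum(y, i, j)
            y[i, j] = rng.uniform() < 1 / (1 + np.exp(-eta))

print("proportion of 1s:", y.mean())
```

Positive beta produces the clumped presence-absence patterns that motivate models of this kind.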
To obtain reliable fish biomass estimates by acoustic methods, it is essential to filter out the signals from unwanted scatterers (e.g. zooplankton). When acoustic data are collected at more than one frequency, methods that exploit the differences in reflectivity of scatterers can be used to separate targets. These methods cannot be applied to historical data, nor to recent data collected on board fishing vessels employed as scientific platforms, where only one transducer is available. Instead, a volume backscattering strength (S-v) threshold is set to separate fish from plankton, both for echogram visualisation and, more importantly, during echo-integration. While empirical methods exist for selecting a threshold, the choice often depends on the subjective decision of the user. A threshold of -47 dB was empirically established in 2008 at the beginning of a series of surveys conducted by Mexico's National Fisheries Institute to assess the biomass of Pacific sardine in the Gulf of California; until a 120 kHz transducer was installed in 2012, only data collected at 38 kHz are available. Here, we propose a probabilistic procedure to estimate an optimal S-v threshold using the Expectation-Maximisation algorithm to fit a mixture of Gaussian distributions to S-v data sampled from schools associated with small pelagic fish and their surrounding echoes. The optimal threshold is given by the Bayes decision function for classifying an S-v value into one of the two groups. The procedure was implemented in the R language environment. The optimal threshold found for the 38 kHz data was -59.4 dB, more than 12 dB lower than the currently used value. This difference prompts the need to revise the acoustic biomass estimates of small pelagics in the Gulf of California.
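The thresholding step can be sketched as follows: fit a two-component Gaussian mixture to S-v samples (in dB) by EM, then take the Bayes decision boundary, i.e. the S-v value at which the two weighted component densities are equal, obtained by solving a quadratic. The simulated component parameters below are illustrative rather than survey values, and the sketch is in Python even though the paper's implementation is in R.

```python
# Two-component Gaussian mixture on simulated Sv data, with the Bayes
# decision boundary computed by solving w1*N(x;m1,v1) = w2*N(x;m2,v2).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
sv = np.concatenate([rng.normal(-70, 4, 4000),    # plankton-like echoes
                     rng.normal(-50, 5, 1000)])   # fish-school-like echoes

gmm = GaussianMixture(n_components=2, random_state=0).fit(sv.reshape(-1, 1))
m1, m2 = gmm.means_.ravel()
v1, v2 = gmm.covariances_.ravel()
w1, w2 = gmm.weights_

# Equating the weighted normal densities and taking logs yields a quadratic
# a*x^2 + b*x + c = 0 in the threshold x.
a = 1 / v2 - 1 / v1
b = 2 * (m1 / v1 - m2 / v2)
c = m2**2 / v2 - m1**2 / v1 + np.log((w1**2 * v2) / (w2**2 * v1))
roots = np.roots([a, b, c]).real
threshold = roots[(roots > min(m1, m2)) & (roots < max(m1, m2))]
print("Bayes Sv threshold (dB):", threshold)
```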
With reference to real data on cataract surgery, we discuss the problem of zero-inflated circular-circular regression, in which both the covariate and the response are circular random variables and a large proportion of the responses are zeros. A regression model is proposed, and an estimation procedure for its parameters is discussed. Some relevant test procedures are also suggested. Simulation studies and a real data analysis are performed to illustrate the applicability of the model.
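As a concrete, hedged sketch of what such a likelihood can look like, the code below combines a point mass at angle zero (probability p) with a von Mises error around a Downs-Mardia-type circular-circular link mu(x) = mu0 + 2*atan(omega*tan((x - nu)/2)). Both the link and the error distribution are common choices in circular regression and are assumptions here, not necessarily the paper's exact model.

```python
# Maximum likelihood for a zero-inflated circular-circular regression sketch:
# with probability p the response is exactly 0; otherwise it is von Mises
# around a Downs-Mardia-type link of the circular covariate.
import numpy as np
from scipy.stats import vonmises
from scipy.optimize import minimize

def negloglik(theta, x, y):
    p = 1 / (1 + np.exp(-theta[0]))              # zero-inflation probability
    mu0, nu, omega, kappa = theta[1], theta[2], theta[3], np.exp(theta[4])
    mu = mu0 + 2 * np.arctan(omega * np.tan((x - nu) / 2))
    ll = np.where(y == 0,
                  np.log(p),
                  np.log1p(-p) + vonmises.logpdf(y, kappa, loc=mu))
    return -ll.sum()

# Simulated data under the assumed model, for illustration only.
rng = np.random.default_rng(5)
x = rng.uniform(-np.pi, np.pi, 500)
mu = 0.5 + 2 * np.arctan(1.2 * np.tan((x - 0.3) / 2))
y = np.where(rng.uniform(size=500) < 0.25, 0.0,
             vonmises.rvs(4.0, loc=mu, random_state=rng))

fit = minimize(negloglik, x0=np.zeros(5), args=(x, y),
               method="Nelder-Mead", options={"maxiter": 5000})
print(fit.x)
```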
Discrete data in the form of proportions with overdispersion and zero inflation can arise in toxicology and similar fields. In regression analysis of such data, a further practical complication is that some responses may be missing. In this paper, we develop an estimation procedure for the parameters of a zero-inflated overdispersed binomial model in the presence of missing responses under three different missing-data mechanisms. A weighted expectation-maximization (EM) algorithm is used for maximum likelihood estimation of the parameters involved. Extensive simulations are conducted to study the properties of the estimates in terms of their average, relative bias, variance, mean squared error, and coverage probability. The simulations show much superior properties for the estimates obtained using the weighted EM algorithm. Some illustrative examples and a discussion are given.
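A simplified sketch of the EM machinery involved is given below for a zero-inflated binomial model with fully observed responses: the E-step computes the posterior probability that each observed zero is structural, and the M-step performs weighted updates. Missingness weights, overdispersion, and covariates, which are central to the paper, are deliberately omitted.

```python
# EM for a zero-inflated binomial: with probability omega the response is a
# structural zero; otherwise it is Binomial(m, pi).
import numpy as np

rng = np.random.default_rng(6)
m, n = 10, 2000                                 # cluster size, sample size
structural = rng.uniform(size=n) < 0.3          # true zero-inflation prob 0.3
y = np.where(structural, 0, rng.binomial(m, 0.4, size=n))

omega, pi = 0.5, 0.5                            # starting values
for _ in range(500):
    # E-step: posterior probability that an observed zero is structural.
    p0 = (1 - pi) ** m
    z = np.where(y == 0, omega / (omega + (1 - omega) * p0), 0.0)
    # M-step: weighted updates of the mixing and binomial parameters.
    omega_new = z.mean()
    pi_new = ((1 - z) * y).sum() / ((1 - z) * m).sum()
    if max(abs(omega_new - omega), abs(pi_new - pi)) < 1e-10:
        break
    omega, pi = omega_new, pi_new

print(f"omega = {omega:.3f}, pi = {pi:.3f}")    # ~0.3 and ~0.4 expected
```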
We propose a novel multivariate model for analyzing hybrid traits and identifying genetic factors for comorbid conditions. Comorbidity is a common phenomenon in mental health in which an individual suffers from multiple disorders simultaneously. For example, in the Study of Addiction: Genetics and Environment (SAGE), alcohol and nicotine addiction were recorded through multiple assessments that we refer to as hybrid traits. Statistical inference for studying the genetic basis of hybrid traits has not been well developed. Recent rank-based methods have been utilized for conducting association analyses of hybrid traits but do not inform the strength or direction of effects. To overcome this limitation, a parametric modeling framework is imperative. Although such parametric frameworks have been proposed in theory, they are neither well developed nor extensively used in practice, owing to their reliance on complicated likelihood functions with high computational complexity. Many existing parametric frameworks instead use pseudo-likelihoods to reduce the computational burden. Here, we develop a model-fitting algorithm for the full likelihood. Our extensive simulation studies demonstrate that inference based on the full likelihood controls the type I error rate, gains power, and improves effect size estimation compared with several existing methods for hybrid models. These advantages remain even if the distribution of the latent variables is misspecified. After analyzing the SAGE data, we identify three genetic variants (rs7672861, rs958331, rs879330) that are significantly associated with the comorbidity of alcohol and nicotine addiction at the chromosome-wide level. Moreover, our approach has greater power in this analysis than several existing methods for hybrid traits. Although the analysis of the SAGE data motivated us to develop the model, it can be broadly applied to analyze any hybrid responses.
The goal of this paper is to address the problem of nonlinear regression with outliers, possibly in high dimension, without specifying the form of the link function and under a parametric approach. Nonlinearity is handled via an underlying mixture of affine regressions. Each regression is encoded in a joint multivariate Student distribution on the responses and covariates. This joint modeling allows the use of an inverse regression strategy to handle the high dimensionality of the data, while the heavy tails of the Student distribution limit contamination by outlying data. The possibility of adding a number of factor-like latent variables to the model further reduces its sensitivity to noise and model misspecification. The mixture model setting has the advantage of providing a natural inference procedure via an EM algorithm. The tractability and flexibility of the algorithm are illustrated in simulations and on real high-dimensional data, with good performance that compares favorably with other existing methods.
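One ingredient of this construction can be sketched compactly: EM for a single linear regression with Student-t errors, where the E-step weights (nu + 1)/(nu + r^2) automatically down-weight outlying residuals. The mixture structure, inverse regression, and factor-like latent variables of the full model are not reproduced, and nu is held fixed here for simplicity.

```python
# EM (scale-mixture-of-normals) for linear regression with Student-t errors:
# the E-step produces per-observation weights; the M-step is weighted LS.
import numpy as np

rng = np.random.default_rng(7)
n, nu = 300, 3.0
x = rng.uniform(-2, 2, n)
y = 1.0 + 2.0 * x + rng.standard_t(nu, n)       # heavy-tailed noise
X = np.column_stack([np.ones(n), x])

beta, sigma2 = np.zeros(2), 1.0
for _ in range(200):
    r2 = (y - X @ beta) ** 2 / sigma2
    w = (nu + 1) / (nu + r2)                    # E-step: latent scale weights
    beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    sigma2 = (w * (y - X @ beta_new) ** 2).mean()
    if np.allclose(beta_new, beta, atol=1e-10):
        break
    beta = beta_new

print("beta:", beta, "sigma2:", sigma2)
```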
Deciding the number of clusters k is one of the most difficult problems in cluster analysis. For this purpose, complexity-penalized likelihood approaches have been introduced in model-based clustering, such as the well-known Bayesian information criterion and the integrated complete likelihood criteria. However, the classification/mixture likelihoods considered in these approaches are unbounded without constraints on the cluster scatter matrices. Such constraints also prevent traditional EM and CEM algorithms from being trapped in (spurious) local maxima. Controlling the maximal ratio between the eigenvalues of the scatter matrices so that it is smaller than a fixed constant c >= 1 is a sensible way of setting such constraints. A new penalized likelihood criterion is proposed that takes into account the higher model complexity that a higher value of c entails. Based on this criterion, a novel and fully automated procedure is provided, leading to a small ranked list of optimal (k, c) couples. A new plot called "car-bike," which provides a concise summary of the solutions, is introduced. The performance of the procedure is assessed both in empirical examples and through a simulation study as a function of cluster overlap. Supplementary materials for the article are available online.
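The eigenvalue-ratio constraint itself can be sketched for a single scatter matrix: clamp its eigenvalues into an interval [m, c*m] and choose the truncation point m that minimizes the induced Gaussian likelihood loss sum_j(log d_m(j) + d_j/d_m(j)). The candidate search below follows the standard pattern-by-pattern argument; inside an actual constrained EM/CEM iteration this device would be applied across all cluster scatter matrices with cluster-size weights.

```python
# Enforce max(eig)/min(eig) <= c on a scatter matrix by optimally truncating
# its eigenvalues into [m, c*m].
import numpy as np

def constrain(S, c):
    d, U = np.linalg.eigh(S)                    # eigenvalues/vectors of S
    if d.min() > 0 and d.max() / d.min() <= c:
        return S                                # constraint already satisfied
    # On each interval of m where the clamping pattern is constant, the loss
    # sum(log(clip(d, m, c*m)) + d / clip(d, m, c*m)) is minimized at a
    # weighted mean; collect these candidates (projected back) and keep best.
    edges = np.concatenate([[0.0], np.unique(np.concatenate([d, d / c]))])
    uppers = np.append(edges[1:], edges[-1] * 2 + 1.0)
    cands = []
    for lo, hi in zip(edges, uppers):
        mid = (lo + hi) / 2
        low, high = d < mid, d > c * mid
        k = low.sum() + high.sum()
        if k == 0:
            continue
        m_star = (d[low].sum() + d[high].sum() / c) / k
        if m_star > 0:
            cands.append(float(np.clip(m_star, lo, hi)))

    def loss(m):
        dm = np.clip(d, m, c * m)
        return np.sum(np.log(dm) + d / dm)

    best = min(cands, key=loss)
    dm = np.clip(d, best, c * best)
    return (U * dm) @ U.T                       # U diag(dm) U^T

S = np.diag([100.0, 1.0, 0.01])                 # eigenvalue ratio 10^4
print(np.linalg.eigvalsh(constrain(S, c=16.0))) # ratio now at most 16
```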