Search results - Inner Mongolia University Library

Algebraic methods for polynomial statistical models

STATISTICS AND COMPUTING, 2002, vol. 12, no. 4, pp. 307-314

Author: Dinwoodie, IH. Affiliation: Tulane Univ, Dept Math, New Orleans, LA 70118, USA

We describe applications of computational algebra to statistical problems of parameter identifiability, sufficiency, and estimation. The methods work for a family of statistical models that includes Poisson and binomi...

Keywords: EM-algorithm; Gröbner basis; identifiability; maximum likelihood; network tomography; sufficient statistic

A general trimming approach to robust cluster analysis

ANNALS OF STATISTICS, 2008, vol. 36, no. 3, pp. 1324-1345

Authors: Garcia-Escudero, Luis A.; Gordaliza, Alfonso; Matran, Carlos; Mayo-Iscar, Agustin. Affiliation: Univ Valladolid, Dept Estadist & Invest Operat, E-47005 Valladolid, Spain

We introduce a new method for performing clustering with the aim of fitting clusters with different scatters and weights. It is designed by allowing to handle a proportion alpha of contaminating data to guarantee the robustness of the method. As a characteristic feature, restrictions on the ratio between the maximum and the minimum eigenvalues of the groups scatter matrices are introduced. This makes the problem to be well defined and guarantees the consistency of the sample solutions to the population ones. The method covers a wide range of clustering approaches depending on the strength of the chosen restrictions. Our proposal includes an algorithm for approximately solving the sample problem.

Keywords: robustness; cluster analysis; trimming; asymptotics; trimmed k-means; EM-algorithm; fast-MCD algorithm; Dykstra's algorithm
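
As a rough illustration of the trimming idea in the abstract above, the following Python sketch implements one-dimensional trimmed k-means, which appears among this record's keywords. It is not the authors' algorithm: it omits the eigenvalue-ratio restriction and the cluster weights, and the function name, the quantile-based initialisation, and the toy data are assumptions made for this example.

```python
def trimmed_kmeans_1d(xs, k, alpha, n_iter=100):
    """Illustrative 1-D trimmed k-means (hypothetical helper, not from the paper).

    At every iteration the int(alpha * n) points farthest from their nearest
    centre are excluded before the centres are recomputed.
    """
    n = len(xs)
    n_trim = int(alpha * n)
    ordered = sorted(xs)
    # Assumption: deterministic start at evenly spaced sample quantiles.
    centres = [ordered[int((j + 0.5) * n / k)] for j in range(k)]

    def assign(cs):
        # Label each point with its nearest centre, then rank by that distance.
        labels = [min(range(k), key=lambda j: (x - cs[j]) ** 2) for x in xs]
        order = sorted(range(n), key=lambda i: (xs[i] - cs[labels[i]]) ** 2)
        kept = order[: n - n_trim]
        trimmed = sorted(order[n - n_trim:])
        return labels, kept, trimmed

    for _ in range(n_iter):
        labels, kept, _unused = assign(centres)
        updated = []
        for j in range(k):
            members = [xs[i] for i in kept if labels[i] == j]
            # An empty cluster keeps its previous centre.
            updated.append(sum(members) / len(members) if members else centres[j])
        if updated == centres:
            break
        centres = updated

    labels, kept, trimmed = assign(centres)
    return centres, labels, trimmed


# Toy data: two tight groups plus one gross outlier.
data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2, 100.0]
centres, labels, trimmed = trimmed_kmeans_1d(data, k=2, alpha=0.15)
print(centres, trimmed)
```

On the toy data the isolated value 100.0 is the one point discarded as contamination, so the two centres are computed from the remaining six observations only.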

Using hidden Markov chains and empirical Bayes change-point estimation for transect data

ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 1997, vol. 4, no. 3, pp. 247-264

Authors: Hoef, JMV; Cressie, N. Affiliation: IOWA STATE UNIV, DEPT STAT, AMES, IA 50011

Consider a lattice of locations in one dimension at which data are observed. We model the data as a random hierarchical process. The hidden process is assumed to have a (prior) distribution that is derived from a two-state Markov chain. The states correspond to the mean values (high and low) of the observed data. Conditional on the states, the observations are modelled, for example, as independent Gaussian random variables with identical variances. In this model, there are four free parameters: the Gaussian variance, the high and low mean values, and the transition probability in the Markov chain. A parametric empirical Bayes approach requires estimation of these four parameters from the marginal (unconditional) distribution of the data and we use the EM algorithm to do this. From the posterior of the hidden process, we use simulated annealing to find the maximum a posteriori (MAP) estimate. Using a Gibbs sampler, we also obtain the maximum marginal posterior probability (MMPP) estimate of the hidden process. We use these methods to determine where change-points occur in spatial transects through grassland vegetation, a problem of considerable interest to plant ecologists.

Keywords: spatial statistics; image analysis; EM-algorithm; simulated annealing; Gibbs sampler
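
The hierarchical model in the abstract above can be sketched in Python as follows. The sketch assumes the four parameters are already known (the paper estimates them with the EM algorithm), a symmetric chain with a single switching probability, and a uniform initial distribution. It also swaps in the exact forward-backward recursion for the Gibbs sampler mentioned in the abstract, so it should be read as an illustration of the model rather than the authors' procedure; all names are hypothetical.

```python
import math


def posterior_high(ys, mu_low, mu_high, sigma2, p_switch):
    """Marginal posterior P(state = high | all data) at each site.

    Two-state hidden Markov chain with Gaussian observations sharing one
    variance; computed with a scaled forward-backward recursion.
    """
    def density(y, mu):
        return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

    def normalise(v):
        total = sum(v)
        return [x / total for x in v]

    n = len(ys)
    # Emission densities: index 0 = low state, index 1 = high state.
    emis = [(density(y, mu_low), density(y, mu_high)) for y in ys]
    # Assumption: symmetric chain, so one probability describes both switches.
    trans = ((1 - p_switch, p_switch), (p_switch, 1 - p_switch))

    # Forward pass, rescaled at every site to avoid underflow.
    alpha = [normalise([0.5 * emis[0][s] for s in (0, 1)])]
    for t in range(1, n):
        prev = alpha[-1]
        alpha.append(normalise([
            emis[t][s] * sum(prev[r] * trans[r][s] for r in (0, 1))
            for s in (0, 1)
        ]))

    # Backward pass, also rescaled.
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = normalise([
            sum(trans[s][r] * emis[t + 1][r] * beta[t + 1][r] for r in (0, 1))
            for s in (0, 1)
        ])

    # Combine: the posterior is proportional to alpha * beta at each site.
    return [normalise([a[0] * b[0], a[1] * b[1]])[1] for a, b in zip(alpha, beta)]


# Toy transect with a shift from the low mean to the high mean.
transect = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]
post = posterior_high(transect, mu_low=0.0, mu_high=5.0, sigma2=1.0, p_switch=0.1)
states = [int(p > 0.5) for p in post]
print(states)
```

Thresholding the marginal posteriors at 0.5 gives a site-by-site state estimate, and a change-point is placed wherever consecutive estimated states differ.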

Haplotype reconstruction for genetically complex regions with ambiguous genotype calls: Illustration by the KIR gene region

GENETIC EPIDEMIOLOGY, 2024, vol. 48, no. 1, pp. 3-26

Authors: van der Burg, Lars L. J.; de Wreede, Liesbeth C.; Baldauf, Henning; Sauter, Juergen; Schetelig, Johannes; Putter, Hein; Boehringer, Stefan. Affiliations: LUMC, Biomed Data Sci, Leiden, Netherlands; DKMS, Dresden, Germany; Univ Hosp Carl Gustav Carus, Dept Internal Med 1, Dresden, Germany

Advances in DNA sequencing technologies have enabled genotyping of complex genetic regions exhibiting copy number variation and high allelic diversity, yet it is impossible to derive exact genotypes in all cases, often resulting in ambiguous genotype calls, that is, partially missing data. An example of such a gene region is the killer-cell immunoglobulin-like receptor (KIR) genes. These genes are of special interest in the context of allogeneic hematopoietic stem cell transplantation. For such complex gene regions, current haplotype reconstruction methods are not feasible as they cannot cope with the complexity of the data. We present an expectation-maximization (EM)-algorithm to estimate haplotype frequencies (HTFs) which deals with the missing data components, and takes into account linkage disequilibrium (LD) between genes. To cope with the exponential increase in the number of haplotypes as genes are added, we add three components to a standard EM-algorithm implementation. First, reconstruction is performed iteratively, adding one gene at a time. Second, after each step, haplotypes with frequencies below a threshold are collapsed in a rare haplotype group. Third, the HTF of the rare haplotype group is profiled in subsequent iterations to improve estimates. A simulation study evaluates the effect of combining information of multiple genes on the estimates of these frequencies. We show that estimated HTFs are approximately unbiased. Our simulation study shows that the EM-algorithm is able to combine information from multiple genes when LD is high, whereas increased ambiguity levels increase bias. Linear regression models based on this EM show that a large number of haplotypes can be problematic for unbiased effect size estimation and that models need to be sparse. In a real data analysis of KIR genotypes, we compare HTFs to those obtained in an independent study. Our new EM-algorithm-based method is the first to account for the full genetic architecture of compl

Keywords: EM-algorithm; haplotype reconstruction; KIR genes

Conditional Logistic Regression With Longitudinal Follow-up and Individual-Level Random Coefficients: A Stable and Efficient Two-Step Estimation Method

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2011, vol. 20, no. 3, pp. 767-784

Authors: Craiu, Radu V.; Duchesne, Thierry; Fortin, Daniel; Baillargeon, Sophie. Affiliations: Univ Toronto, Dept Stat, Toronto, ON M5S 3G3, Canada; Univ Laval, Dept Math & Stat, Quebec City, PQ G1V 0A6, Canada; Univ Laval, Ctr Etud Foret, Quebec City, PQ G1V 0A6, Canada; Univ Laval, Dept Biol, Quebec City, PQ G1V 0A6, Canada

The analysis of data generated by animal habitat selection studies, by family studies of genetic diseases, or by longitudinal follow-up of households often involves fitting a mixed conditional logistic regression model to longitudinal data composed of clusters of matched case-control strata. The estimation of model parameters by maximum likelihood is especially difficult when the number of cases per stratum is greater than one. In this case, the denominator of each cluster contribution to the conditional likelihood involves a complex integral in high dimension, which leads to convergence problems in the numerical maximization. In this article we show how these computational complexities can be bypassed using a global two-step analysis for nonlinear mixed effects models. The first step estimates the cluster-specific parameters and can be achieved with standard statistical methods and software based on maximum likelihood for independent data. The second step uses the EM-algorithm in conjunction with conditional restricted maximum likelihood to estimate the population parameters. We use simulations to demonstrate that the method works well when the analysis is based on a large number of strata per cluster, as in many ecological studies. We apply the proposed two-step approach to evaluate habitat selection by pairs of bison roaming freely in their natural environment. This article has supplementary material online.

Keywords: CREML; EM-algorithm; Habitat selection; Mixed effects; Mixed multinomial logit; One-step estimator; REML; Two-step analysis

Modified likelihood ratio test in finite mixture models with a structural parameter

JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2005, vol. 129, no. 1-2, pp. 93-107

Authors: Chen, JH; Kalbfleisch, JD. Affiliations: Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada; Univ Michigan, Sch Publ Hlth, Dept Biostat, Ann Arbor, MI 48109, USA

The finite mixture model is an example of a non-regular parametric family, and most classical asymptotic results cannot be directly applied. In particular, the asymptotic properties of likelihood ratio statistics for testing for the number of subpopulations are complicated and difficult to establish. One approach that has been found to simplify the asymptotic results while preserving the power of the test is to modify the likelihood function by incorporating a penalty term to avoid boundary problems. The asymptotic properties and the use of likelihood ratio results are even more difficult when an unknown structural parameter is involved in the model. In this paper, we study an application of the modified likelihood approach to finite normal mixture models with a common and unknown variance in the mixing components and consider a test of the hypothesis of a homogeneous model versus a mixture on two or more components. We show that the $\chi^2_2$ distribution is a stochastic lower bound to the limiting distribution of the likelihood ratio statistic. This same distribution is also shown to be a stochastic upper bound to the limiting distribution of the modified likelihood ratio statistic. A small simulation study suggests that both bounds are relatively tight and practically useful. An example from genetics is used to illustrate the technique. (C) 2004 Elsevier B.V. All rights reserved.

Keywords: EM-algorithm; Hardy-Weinberg law; normal mixture; stochastic bounds

Birnbaum-Saunders frailty regression models for clustered survival data

STATISTICS AND COMPUTING, 2024, vol. 34, no. 4, p. 141

Authors: Gallardo, Diego I.; Bourguignon, Marcelo; Romeo, Jose S. Affiliations: Univ Bio Bio, Fac Ciencias, Dept Estadist, Concepcion, Chile; Univ Fed Rio Grande do Norte, Dept Estat, Natal, RN, Brazil; Massey Univ, Coll Hlth, Social & Hlth Outcomes Res & Evaluat SHORE, Auckland, New Zealand; Massey Univ, Coll Hlth, Whariki Res Ctr, Auckland, New Zealand

We present a novel frailty model for modeling clustered survival data. In particular, we consider the Birnbaum-Saunders (BS) distribution for the frailty terms with a new directly parameterized on the variance of the frailty distribution. This allows, among other things, compare the estimated frailty terms among traditional models, such as the gamma frailty model. Some mathematical properties of the new model are studied including the conditional distribution of frailties among the survivors, the frailty of individuals dying at time t, and the Kendall's tau measure. Furthermore, an explicit form to the derivatives of the Laplace transform for the BS distribution using the di Bruno's formula is found. Parametric, non-parametric and semiparametric versions of the BS frailty model are studied. We use a simple Expectation-Maximization (EM) algorithm to estimate the model parameters and evaluate its performance under different censoring proportion by a Monte Carlo simulation study. We also show that the BS frailty model is competitive over the gamma and weighted Lindley frailty models under misspecification. We illustrate our methodology by using a real data sets.

Keywords: Censored data; Clustered survival data; EM-algorithm; Frailty models; Generalized inverse-Gaussian model

On a reparameterization of a flexible family of cure models

STATISTICS IN MEDICINE, 2022, vol. 41, no. 21, pp. 4091-4111

Author: Milienos, Fotios S. Affiliation: Panteion Univ Social & Polit Sci, Dept Sociol, 136 Syngrou Ave, Athens 17671, Greece

The existence of items not susceptible to the event of interest is of both theoretical and practical importance. Although researchers may provide, for example, biological, medical, or sociological evidence for the presence of such items (cured), statistical models performing well under the existence or not of a cured proportion, frequently offer a necessary flexibility. This work introduces a new reparameterization of a flexible family of cure models, which not only includes among its special cases, the most studied cure models (such as the mixture, bounded cumulative hazard, and negative binomial cure model) but also classical survival models (ie, without cured items). One of the main properties of the proposed family, apart from its computationally tractable closed form, is that the case of zero cured proportion is not found at the boundary of the parameter space, as it typically happens to other families. A simulation study examines the (finite) performance of the suggested methodology, focusing to the estimation through EM algorithm and model discrimination, by the aid of the likelihood ratio test and Akaike information criterion; for illustrative purposes, analysis of two real life datasets (on recidivism and cutaneous melanoma) is also carried out.

Keywords: bounded cumulative hazard cure model; censored lifetime data; cure models; EM-algorithm; likelihood ratio test; melanoma data; mixture cure model; recidivism data

Covariate Adaptive False Discovery Rate Control With Applications to Omics-Wide Multiple Testing

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2022, vol. 117, no. 537, pp. 411-427

Authors: Zhang, Xianyang; Chen, Jun. Affiliations: Texas A&M Univ, Dept Stat, College Stn, TX 77843, USA; Mayo Clin, Div Biomed Stat & Informat, Rochester, MN 55901, USA; Mayo Clin, Ctr Individualized Med, Rochester, MN 55901, USA

Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the patterns of signals and nulls are available. In this article, we introduce an FDR control procedure in large-scale inference problem that can incorporate covariate information. We develop a fast algorithm to implement the proposed procedure and prove its asymptotic validity even when the underlying likelihood ratio model is misspecified and the p-values are weakly dependent (e.g., strong mixing). Extensive simulations are conducted to study the finite sample performance of the proposed method and we demonstrate that the new approach improves over the state-of-the-art approaches by being flexible, robust, powerful, and computationally efficient. We finally apply the method to several omics datasets arising from genomics studies with the aim to identify omics features associated with some clinical and biological phenotypes. We show that the method is overall the most powerful among competing methods, especially when the signal is sparse. The proposed covariate adaptive multiple testing procedure is implemented in the R package CAMT. Supplementary materials for this article are available online.

Keywords: Covariates; EM-algorithm; False discovery rate; Multiple testing

ANALYSIS OF BINARY DATA FROM A MULTICENTER CLINICAL-TRIAL

BIOMETRIKA, 1993, vol. 80, no. 1, pp. 127-139

Authors: RAGHUNATHAN, TE; II, YC. Affiliation: Department of Biostatistics SC-32, University of Washington, Seattle, Washington 98195, U.S.A.

We develop several methods for estimating the treatment effect difference defined as the overall log-odds ratio of favourable response in a multicentre clinical trial comparing two treatments with binary response. A simulation study compares the bias and mean squared error of the point estimates and the exact coverage probabilities of confidence intervals obtained distributions.

Keywords: CATEGORICAL DATA; CONDITIONAL LIKELIHOOD; EM-ALGORITHM; EMPIRICAL BAYES; MIXED MODEL
