Count data is often modeled using Poisson regression, although this probability model restricts the conditional variance to be equal to the conditional mean (the equidispersion property). While overdispersion has been intensively studied, there are few alternative models in the statistical literature for analyzing count data with underdispersion. The primary goal of this paper is to introduce a novel model based on Bernoulli-Poisson convolution for modelling count data that are underdispersed relative to the Poisson distribution. We study the statistical properties of the proposed model, and we provide a useful interpretation of the parameters. We consider a regression structure for both components based on a new parameterization indexed by mean and dispersion parameters. An expectation-maximization (EM) algorithm is proposed for parameter estimation, and some diagnostic measures based on the EM algorithm are considered. Simulation studies are conducted to evaluate its finite-sample performance. Finally, we illustrate the usefulness of the new regression model by an application.
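To make the underdispersion claim concrete, here is a minimal sketch that assumes the convolution is the sum Y = B + N of an independent Bernoulli(p) and Poisson(lam) variable. The abstract does not give the paper's parameterization, so the function names and the symbols `p` and `lam` are illustrative assumptions. Under that assumption the mean is p + lam and the variance is p(1 - p) + lam, so the variance is below the mean whenever 0 < p <= 1.

```python
import math

def poisson_pmf(k, lam):
    """Poisson(lam) probability of k, computed on the log scale for stability."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def bernoulli_poisson_pmf(y, p, lam):
    """P(Y = y) for Y = B + N with B ~ Bernoulli(p), N ~ Poisson(lam), independent.

    Assumed form of the convolution; not taken from the paper itself.
    """
    return (1.0 - p) * poisson_pmf(y, lam) + p * poisson_pmf(y - 1, lam)

def mean_and_variance(p, lam, upper=100):
    """Numerical mean and variance from the pmf, truncated at `upper`."""
    probs = [bernoulli_poisson_pmf(y, p, lam) for y in range(upper + 1)]
    mean = sum(y * q for y, q in enumerate(probs))
    second = sum(y * y * q for y, q in enumerate(probs))
    return mean, second - mean ** 2

if __name__ == "__main__":
    m, v = mean_and_variance(p=0.6, lam=2.0)
    # The dispersion index v / m is below 1, i.e. underdispersed relative to Poisson.
    print(m, v, v / m)
```

The truncation point `upper` is only adequate for moderate values of `lam`; a larger rate would need a larger bound.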
A multivariate normal mean-variance mixture based on a Birnbaum-Saunders (NMVMBS) distribution is introduced and several properties of this new distribution are discussed. A new robust non-Gaussian ARCH-type model is proposed in which there exists a relation between the variance of the observations, and the marginal distributions are NMVMBS. A simple EM-based maximum likelihood procedure for estimating the parameters of this normal mean-variance mixture distribution is given. A simulation study and some real data are used to demonstrate the modelling strength of this new model.
We consider estimation and testing of linkage equilibrium from genotypic data on a random sample of sibs, such as monozygotic and dizygotic twins. We compute the maximum likelihood estimator with an EM algorithm and a likelihood ratio statistic that takes the family structure into account. As we are interested in applying this to twin data, we also allow observations on single children, so that monozygotic twins can be included. We allow a non-zero recombination fraction between the loci of interest, so that linkage disequilibrium between both linked and unlinked loci can be tested. The EM algorithm for computing the maximum likelihood estimator of the haplotype frequencies and the likelihood ratio test statistic are described in detail. It is shown that the usual estimators of haplotype frequencies based on ignoring that the sibs are related are inefficient, and the likelihood ratio test for testing that the loci are in linkage disequilibrium.
We discuss an interpretation of the mixture transition distribution (MTD) for discrete-valued time series which is based on a sequence of independent, occasion-specific latent variables. We show that, by assuming that this latent process follows a first-order Markov chain, the MTD can be generalized in a sensible way. The result is a class of models that also includes the hidden Markov model (HMM). For these models we outline an EM algorithm for maximum likelihood estimation which exploits recursions developed within the HMM literature. As an illustration, we provide an example based on the analysis of stock market data from different American countries.
A critical issue in modeling binary response data is the choice of the link. We introduce a new link based on the Student's t-distribution (t-link) for correlated binary data. The t-link extends the common probit-normal link by adding one parameter which controls the heaviness of the tails of the link. We propose an EM algorithm for computing the maximum likelihood estimates for generalized linear mixed t-link models for correlated binary data. In contrast with recent developments (Tan et al. in J. Stat. Comput. Simul. 77:929-943, 2007; Meza et al. in Comput. Stat. Data Anal. 53:1350-1360, 2009), this algorithm uses closed-form expressions at the E-step, as opposed to Monte Carlo simulation. Our proposed algorithm relies on available formulas for the mean and variance of a truncated multivariate t-distribution. To illustrate the new method, a real data set on respiratory infection in children and a simulation study are presented.
Polymerase chain reaction (PCR) based tests are commonly used to diagnose various infections. Such tests are assumed to be highly 'sensitive'; however, no consensus definition of, or method for estimating, sensitivity exists. Hughes and Totten proposed that sensitivity be defined as a function of the number of target DNA molecules in the sample, with specificity corresponding to the case where there is no target DNA molecule present. They then developed parametric, non-parametric and semi-parametric models for estimating the sensitivity curve. In this paper a general model is proposed that yields their three models as special cases when specificity is assumed to be 1.0. We also extend the general model to incorporate covariates. Simulation studies are used to compare the different estimators. The methods are applied to data from a PCR-based test for Mycoplasma genitalium. Copyright (c) 2004 John Wiley & Sons, Ltd.
This letter further explores non-Gaussian factor analysis (NFA) based on Bayesian Ying-Yang learning by investigating its key yet analytically intractable factor-estimating step. Among the three suggested numerical approaches, we show empirically that the so-called iterative fixed posteriori approximation approach performs best, and we prove theoretically that the iterative fixed posteriori approximation is another type of EM algorithm, with a proof of its convergence also given.
While sophisticated neural networks and graphical models have been developed for predicting conditional probabilities in a non-stationary environment, major improvements in the training schemes are still required to make these approaches practically viable. (C) 2000 Elsevier Science Ltd. All rights reserved.
This paper explores how localized mixture models can be used for prediction using time series data. The estimation method presented in this study is a kernel-weighted version of an EM algorithm, where exponential kernels with different bandwidths are used as weight functions. Nadaraya-Watson and local linear estimators are used to carry out localized estimations. Furthermore, in order to demonstrate suitability for prediction at a future time point, a methodology for bandwidth selection and adequate methods are outlined for each model, and then compared with competing forecasting routines. A simulation study is executed to assess the performance of these models for prediction. Real data are also used to investigate the performance of the localized mixture models for prediction. The data used are predominantly taken from the International Energy Agency (IEA).
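As a rough illustration of the localization idea, the sketch below computes a Nadaraya-Watson estimate at a target time point using exponential kernel weights. The kernel form exp(-|t0 - t| / h), the function names, and the toy series are assumptions made for illustration; the abstract does not specify the paper's exact kernel, bandwidth selection rule, or the mixture-model EM step these weights would feed into.

```python
import math

def exponential_kernel_weights(times, target, bandwidth):
    """Weights exp(-|target - t| / bandwidth); nearer time points weigh more.

    Assumed kernel form, chosen for illustration only.
    """
    return [math.exp(-abs(target - t) / bandwidth) for t in times]

def nadaraya_watson(times, values, target, bandwidth):
    """Kernel-weighted average of `values`, localized at `target`."""
    weights = exponential_kernel_weights(times, target, bandwidth)
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

if __name__ == "__main__":
    # Hypothetical toy series, not data from the paper.
    times = [0, 1, 2, 3, 4, 5]
    values = [10.0, 10.5, 11.0, 12.0, 13.5, 15.0]
    for h in (0.5, 2.0, 10.0):
        # Prediction at the future time point t = 6 for several bandwidths.
        print(h, nadaraya_watson(times, values, target=6, bandwidth=h))
```

A small bandwidth makes the estimate track the most recent observations, while a large one approaches the plain sample mean, which is why bandwidth selection matters for prediction.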
The problem of assessing the relative calibrations and relative accuracies of a set of p instruments, each designed to measure the same characteristic on a common group of n individuals, is considered. Two models have been proposed in the literature to analyse data from such experiments. One, which we call the regression model version, was introduced by Barnett (1969) and the other, the factor analysis version, was studied by Theobald and Mallison (1978). Both models assume normally distributed errors. In this paper the normal distribution is replaced by the t-distribution. Estimation is approached via the EM algorithm. Relationships between the two models are explored which allow estimators and the information matrix to be passed from one model to the other. The estimating algorithm and information matrix are developed under the factor analysis version, which is more easily handled computationally, and inference is performed under the regression model version, which is more easily interpretable. The approach developed can also be used in factor analysis models with one factor and the normal distribution replaced by the t-distribution.