Ordinal regression models, as special cases of multivariate generalized linear models, are extended to include random effects in the linear predictor. Random effects may describe the shifting of thresholds in the cumulative model, or they may describe subject-specific weights of covariates. In a more general case, rather than a common shift, all thresholds may depend on the subject; this makes alternative link functions advisable. Three estimation procedures based on the EM algorithm are considered. Two of them make use of numerical integration techniques (Gauss-Hermite quadrature or Monte Carlo), and the third is an EM-type algorithm based on posterior modes. The estimation procedures are illustrated and compared on two examples.
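To make the Gauss-Hermite step concrete, here is a minimal sketch of how an integral over a normal random effect b ~ N(0, sigma^2) is replaced by a weighted sum; the function `marginal_expectation` and its arguments are illustrative, not taken from the paper:

```python
import numpy as np

def marginal_expectation(g, sigma, n_nodes=20):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    Substituting b = sqrt(2)*sigma*x turns the Gaussian integral into the
    Gauss-Hermite form  (1/sqrt(pi)) * integral exp(-x^2) g(sqrt(2)*sigma*x) dx,
    which the nodes x_k and weights w_k approximate exactly for polynomials.
    """
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    return np.sum(w * g(np.sqrt(2.0) * sigma * x)) / np.sqrt(np.pi)
```

In a random-effects likelihood, `g` would be a subject's conditional likelihood as a function of the random effect; here it can be checked on moments, e.g. E[b^2] should return sigma^2.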
Modern dimensional inspection techniques often produce more measurements per part than parts measured within reasonable timeframes. This poses a problem for multivariate process monitoring and capability analysis: the sample covariance matrix is rank deficient, hence not positive definite and not invertible. When the measurement sites form a multidimensional lattice, spatially stationary covariance models provide positive definite estimates regardless of the number of measurements per part. I show that these estimates may be used in place of the sample covariance matrix to extend, and in some cases improve, standard multivariate methods. I describe a general class of lattices for which positive definite estimates are obtained via simple averaging or a closed-form EM algorithm. The proposed estimation and analysis procedures are illustrated in three case studies.
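As one illustration of the "simple averaging" idea, consider the special case of a one-dimensional cyclic lattice (an assumption of this sketch, not the paper's general class): averaging the sample covariance over all cyclic shifts projects it onto circulant, i.e. spatially stationary, matrices, and because each term is a congruence transform P S P^T, positive semidefiniteness is preserved:

```python
import numpy as np

def circulant_average(S):
    """Project a sample covariance S (p x p) onto circulant matrices by
    averaging S over all cyclic relabelings of the p lattice sites.

    Each summand S[idx][:, idx] equals P S P^T for a cyclic-shift
    permutation P, so the average remains positive semidefinite, and the
    (i, j) entry of the result depends only on (i - j) mod p.
    """
    p = S.shape[0]
    out = np.zeros_like(S, dtype=float)
    for k in range(p):
        idx = np.roll(np.arange(p), k)
        out += S[np.ix_(idx, idx)]
    return out / p
```

Even when the raw sample covariance is rank deficient (fewer parts than sites), the averaged estimate is typically full rank, which is the practical point of stationary modeling here.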
In the analysis of spatial data there may be a need to transfer data collected originally on one set of areal units (source regions) to a different set of areal units (target regions). The MapInfo Desktop Mapping Package currently enables the user to carry out this process by simple areal weighting. The Enhanced Areal Interpolation method introduced by Flowerdew, Green, and Kehris, which is essentially an application of the EM algorithm, postulates a more sophisticated statistical model and makes use of ancillary information available on the target regions. It is an iterative method and uses the simple areal weighting estimates as its starting point. In this paper we describe the implementation of this method for count data in MapInfo, thus making it readily accessible to the general user. Copyright (C) 1996 Elsevier Science Ltd
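The simple areal weighting starting point can be sketched in a few lines; the data structures here (`source_counts`, `source_areas`, `overlap`) are hypothetical, chosen only for illustration:

```python
def areal_weighting(source_counts, source_areas, overlap):
    """Simple areal weighting: each source region's count is spread over
    the target regions in proportion to the area of overlap.

    overlap[(s, t)] is the area of the intersection of source region s
    and target region t; counts are assumed uniformly dense within each
    source region.
    """
    target = {}
    for (s, t), a in overlap.items():
        target[t] = target.get(t, 0.0) + source_counts[s] * a / source_areas[s]
    return target
```

The EM-based enhanced method then replaces the uniform-density assumption with a statistical model using target-region covariates, iterating from these estimates.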
This paper studies a class of Poisson mixture models that includes covariates in the rates. The model contains Poisson regression and independent Poisson mixtures as special cases. Estimation methods based on the EM and quasi-Newton algorithms, properties of the estimates, a model selection procedure, residual analysis, and a goodness-of-fit test are discussed. A Monte Carlo study investigates implementation and model-choice issues. The methodology is used to analyze seizure-frequency and Ames salmonella assay data.
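As a sketch of the EM scheme in the special case of an independent Poisson mixture without covariates (a simplification of the class studied here, not the paper's full model), the E-step computes component responsibilities and the M-step reweights the rates and mixing proportions:

```python
import math

def poisson_mixture_em(y, k=2, iters=300):
    """EM for a k-component Poisson mixture (no-covariate special case).

    E-step: responsibilities r[i][j] proportional to pi_j * Pois(y_i; lam_j),
    computed on the log scale for stability.
    M-step: lam_j and pi_j become responsibility-weighted averages.
    """
    lam = [min(y) + 0.5 + j * (max(y) - min(y)) / k for j in range(k)]
    pi = [1.0 / k] * k
    m = len(y)
    for _ in range(iters):
        r = []
        for yi in y:
            logp = [math.log(pi[j]) + yi * math.log(lam[j]) - lam[j]
                    - math.lgamma(yi + 1) for j in range(k)]
            top = max(logp)
            w = [math.exp(l - top) for l in logp]
            s = sum(w)
            r.append([wi / s for wi in w])
        for j in range(k):
            nj = sum(r[i][j] for i in range(m))
            lam[j] = sum(r[i][j] * y[i] for i in range(m)) / nj
            pi[j] = nj / m
    return lam, pi
```

On well-separated counts the fitted rates approach the within-cluster means; the paper's model would additionally make each rate a regression function of covariates.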
A model of interval censorship of a failure time T is considered when there is only one inspection time Y. The observable data are n independent copies of the pair (Y, delta), where delta = 1{T <= Y} indicates whether failure has occurred by the inspection time. We construct a class of self-consistent estimators of the survival function of T, defined implicitly through two equations, and show their strong consistency under certain conditions. The properties of the nonparametric maximum likelihood estimator are also investigated.
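For the nonparametric maximum likelihood estimator mentioned above, one standard computation (a sketch, not necessarily the paper's construction) uses the fact that the NPMLE of the distribution function F from such current-status data is the isotonic regression of the indicators delta on the sorted inspection times, obtainable by the pool-adjacent-violators algorithm:

```python
def npmle_current_status(y, delta):
    """NPMLE of F at the inspection times from current-status data.

    Sort subjects by inspection time and fit the non-decreasing sequence
    closest to the 0/1 indicators delta, pooling adjacent blocks whose
    means violate monotonicity (pool-adjacent-violators).
    The survival function estimate is 1 - F.
    """
    order = sorted(range(len(y)), key=lambda i: y[i])
    blocks = []  # each block: [sum of deltas, count]
    for i in order:
        blocks.append([float(delta[i]), 1])
        # pool while the previous block's mean exceeds the current one's
        while len(blocks) > 1 and \
                blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fhat = []
    for s, c in blocks:
        fhat.extend([s / c] * c)
    return [y[i] for i in order], fhat
```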
The effect of misclassification of phenotypes of a trait on the estimation of the recombination value was investigated. The effect was larger for closer linkage. If a locus is dominant and linked with the misclassified trait locus in the repulsion phase, the effect on the recombination value between the two loci is largest. A method for estimating the unbiased recombination value and the misclassification rate by maximum likelihood, via an EM algorithm, is also presented. The method was applied to a numerical example from rice genome data. It is concluded that the present method, combined with metric multidimensional scaling, is useful for detecting misclassified markers and for estimating unbiased recombination values.
There are a variety of methods in the literature which seek to make iterative estimation algorithms more manageable by breaking the iterations into a greater number of simpler or faster steps. Those algorithms which deal at each step with a proper subset of the parameters are called in this paper partitioned algorithms. Partitioned algorithms in effect replace the original estimation problem with a series of problems of lower dimension. The purpose of the paper is to characterize some of the circumstances under which this process of dimension reduction leads to significant benefits. Four types of partitioned algorithms are distinguished: reduced objective function methods, nested (partial Gauss-Seidel) iterations, zigzag (full Gauss-Seidel) iterations, and leapfrog (non-simultaneous) iterations. Emphasis is given to Newton-type methods using analytic derivatives, but a nested EM algorithm is also given. Nested Newton methods are shown to be equivalent to applying the same Newton method to the reduced objective function, and are applied to separable regression and generalized linear models. Nesting is shown generally to improve the convergence of Newton-type methods, both by improving the quadratic approximation to the log-likelihood and by improving the accuracy with which the observed information matrix can be approximated. Nesting is recommended whenever a subset of parameters is relatively easily estimated. The zigzag method is shown to produce a stable but generally slow iteration; it is fast and recommended when the parameter subsets have approximately uncorrelated estimates. The leapfrog iteration has fewer guaranteed properties in general, but is similar to nesting and zigzagging when the parameter subsets are orthogonal.
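A toy illustration of the zigzag (full Gauss-Seidel) idea, not taken from the paper: for least squares with an intercept and a slope, alternately minimizing exactly over each one-parameter block converges to the joint minimizer, and does so quickly when the two estimates are nearly uncorrelated (for instance when x is centered):

```python
def zigzag_least_squares(x, y, iters=200):
    """Zigzag (full Gauss-Seidel) iteration for the model y ~ a + b*x.

    Each sweep minimizes the residual sum of squares exactly over a with
    b held fixed, then over b with a held fixed.  For this convex
    quadratic objective the sweeps contract linearly toward the OLS
    solution, at a rate governed by the correlation between the
    intercept column and x.
    """
    n = len(x)
    a, b = 0.0, 0.0
    for _ in range(iters):
        a = sum(yi - b * xi for xi, yi in zip(x, y)) / n
        b = sum(xi * (yi - a) for xi, yi in zip(x, y)) \
            / sum(xi * xi for xi in x)
    return a, b
```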
Receiver operating characteristic (ROC) analysis is the commonly accepted method for comparing diagnostic imaging systems. In general, ROC studies are designed so that multiple readers read the same images, and each image is presented via two different imaging systems. Statistical methods for comparing the ROC curves from one reader have been developed, but extending these methods to multiple readers is not straightforward. A new method of analysis is presented for comparing ROC curves from multiple readers. The method includes a nonparametric estimation of the variances of, and covariances between, the various areas under the curves. The method described is more appropriate than the paired t test because it also takes case-sample variation into account.
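The nonparametric area estimate underlying such comparisons is the Mann-Whitney statistic; a minimal sketch, assuming higher scores indicate disease and counting ties as one half:

```python
def auc(pos_scores, neg_scores):
    """Nonparametric area under the ROC curve.

    The Mann-Whitney estimate of P(score_pos > score_neg), i.e. the
    fraction of (positive, negative) case pairs the reader ranks
    correctly, with ties contributing 1/2.
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

The multi-reader method then needs the variances and covariances of several such areas over the same case sample, which is what the nonparametric estimation in the abstract supplies.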
We derive the profile likelihood function of the mixing parameters in the discrete-time mover-stayer model. This result is used to find a simple necessary and sufficient condition for the maximum likelihood estimator to take values in the interior of the parameter space. We point out the relevance of this result for the convergence properties of the EM algorithm. Furthermore, the likelihood-ratio test for the hypothesis that there are equality constraints among the mixing parameters is developed. Finally, an illustration of the use of the results is given.
We generalize an approach suggested by Hill (Heredity, 33, 229-239, 1974) for testing for significant association among alleles at two loci when only genotype and not haplotype frequencies are available. The principle is to use the Expectation-Maximization (EM) algorithm to resolve double heterozygotes into haplotypes and then apply a likelihood-ratio test in order to determine whether the resolutions of haplotypes are significantly nonrandom, which is equivalent to testing whether there is statistically significant linkage disequilibrium between loci. The EM algorithm in this case relies on the assumption that genotype frequencies at each locus are in Hardy-Weinberg proportions. This method can accommodate X-linked loci and samples from haplodiploid species. We use three methods for testing the significance of the likelihood ratio: the empirical distribution in a large number of randomized data sets, the chi-square approximation for the distribution of likelihood ratios, and the Z^2 test. The performance of each method is evaluated by applying it to simulated data sets and comparing the tail probability with the tail probability from Fisher's exact test applied to the actual haplotype data. For realistic sample sizes (50-150 individuals) all three methods perform well with two or three alleles per locus, but only the empirical distribution is adequate when there are five to eight alleles per locus, as is typical of hypervariable loci such as microsatellites. The method is applied to a data set of 32 microsatellite loci in a Finnish population, and the results confirm the theoretical predictions. We conclude that with highly polymorphic loci the EM algorithm does lead to a useful test for linkage disequilibrium, but that it is necessary to find the empirical distribution of likelihood ratios in order to perform a test of significance correctly.
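For two biallelic loci, the EM step described above is short enough to sketch: only the double heterozygote is ambiguous, and the E-step splits it between its two possible haplotype resolutions in proportion to the current frequency estimates, under the Hardy-Weinberg assumption noted above. The 3x3 count-table encoding is an assumption of this sketch:

```python
def em_haplotype_freqs(n, iters=500):
    """EM estimate of two-locus haplotype frequencies from genotype counts.

    n[i][j] counts individuals carrying i copies of allele A and j copies
    of allele B (i, j in {0, 1, 2}).  Every genotype determines its two
    haplotypes except the double heterozygote n[1][1], which is either
    AB/ab or Ab/aB; the E-step splits it in proportion to the current
    products p_AB * p_ab and p_Ab * p_aB (Hardy-Weinberg proportions).
    """
    N2 = 2.0 * sum(sum(row) for row in n)  # total number of haplotypes
    # haplotype counts contributed by the unambiguous genotypes
    base = {
        "AB": 2 * n[2][2] + n[2][1] + n[1][2],
        "Ab": 2 * n[2][0] + n[2][1] + n[1][0],
        "aB": 2 * n[0][2] + n[0][1] + n[1][2],
        "ab": 2 * n[0][0] + n[0][1] + n[1][0],
    }
    p = {h: 0.25 for h in base}
    for _ in range(iters):
        # E-step: probability a double heterozygote resolves as AB/ab
        num = p["AB"] * p["ab"]
        q = num / (num + p["Ab"] * p["aB"])
        # M-step: frequencies from expected complete-data haplotype counts
        p = {
            "AB": (base["AB"] + q * n[1][1]) / N2,
            "ab": (base["ab"] + q * n[1][1]) / N2,
            "Ab": (base["Ab"] + (1 - q) * n[1][1]) / N2,
            "aB": (base["aB"] + (1 - q) * n[1][1]) / N2,
        }
    return p
```

The likelihood-ratio test in the abstract then compares the maximized likelihood under these fitted frequencies with the likelihood under independence (products of allele frequencies); the multi-allele and X-linked versions extend the same E-step to more ambiguous genotype classes.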