In this paper we present a logistic mixture model for rain rate, that is, a model where the regime probabilities are allowed to change over time and are modeled with a logistic regression structure. Such a model may b...
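The abstract above is truncated, but the model structure it names is concrete enough for a small illustration. The following sketch is a hypothetical two-regime version in Python: the gamma components, their parameter values, and the use of time itself as the logistic covariate are all illustrative assumptions, not the paper's specification; the point is only that the mixing weight is a logistic function of covariates rather than a constant.

import numpy as np
from scipy.stats import gamma

def regime_prob(t, beta0, beta1):
    # Time-varying weight of the "wet" regime: a logistic function of a
    # covariate (here simply time t) instead of a constant mixing weight.
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * t)))

def rain_rate_density(y, t, beta0, beta1):
    # Hypothetical two-regime mixture of a heavy-rain and a light-rain
    # gamma component; all parameter values are illustrative only.
    p = regime_prob(t, beta0, beta1)
    return p * gamma.pdf(y, a=2.0, scale=5.0) + (1 - p) * gamma.pdf(y, a=1.5, scale=0.5)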
The paper mainly aims to extend the bivariate generalized exponential distribution to a multivariate generalized exponential distribution. It also provides explicit forms of the joint cumulative distribution function and the joint probability density function, and further shows that the EM algorithm can be used to compute the maximum likelihood estimators of the unknown parameters.
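For background (the following formulas are standard properties of this family, not quoted from the paper), the univariate generalized exponential distribution has cumulative distribution function

F(x;\alpha,\lambda) = \left(1 - e^{-\lambda x}\right)^{\alpha}, \qquad x > 0,\ \alpha > 0,\ \lambda > 0.

Because this CDF is a power of the exponential CDF, multivariate versions can be built by maximization over latent GE components with a shared term, e.g. X_i = \max(U_i, U_0), so the joint CDF acquires a factor \left(1 - e^{-\lambda \min_i x_i}\right)^{\alpha_0}. Under such a construction EM is natural, since the latent shared component can be treated as missing data; this is one common device, not necessarily the paper's.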
Hunt (1996) implemented the finite mixture model approach to clustering in a program called MULTIMIX. The program is designed to cluster multivariate data that have categorical and continuous variables and that possibly contain missing values. This paper describes the approach taken to design MULTIMIX and how some of the statistical problems were dealt with. As an example, the program is used to cluster a large medical dataset.
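MULTIMIX's combination of mixture clustering with missing values invites a small illustration. The sketch below is a simplified stand-in, not MULTIMIX itself: it assumes conditionally independent Gaussian variables within each cluster (MULTIMIX is more general, grouping variables into cells and handling categorical ones), so that missing entries, coded as NaN, simply drop out of the per-cluster log-density in the E step.

import numpy as np
from scipy.stats import norm

def responsibilities(X, weights, means, sds):
    # E step for a Gaussian mixture with conditionally independent
    # variables; NaN entries in X are treated as missing at random and
    # contribute nothing to the per-cluster log-density.
    n, _ = X.shape
    K = len(weights)
    log_r = np.zeros((n, K))
    for k in range(K):
        logpdf = norm.logpdf(X, loc=means[k], scale=sds[k])
        logpdf = np.where(np.isnan(X), 0.0, logpdf)  # missing -> marginalized out
        log_r[:, k] = np.log(weights[k]) + logpdf.sum(axis=1)
    log_r -= log_r.max(axis=1, keepdims=True)        # stabilize before exponentiating
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)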
In this work, we study a class of p-order non-negative integer-valued autoregressive (INAR(p)) processes with innovations following zero-inflated (ZI) distributions, called ZI-INAR(p) processes. Based on the EM algorithm, we present an estimation procedure for the model parameters. We also develop a regenerative bootstrap method to construct confidence intervals for the parameters and to estimate the forecasting distributions of future values. We discuss asymptotic properties of the regenerative bootstrap method. The performance of the proposed methods is evaluated through two simulation studies and the analysis of a real dataset.
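For intuition about the process class, the sketch below simulates the simplest member, an INAR(1) with zero-inflated Poisson innovations; the parameter names alpha, lam, and pi0 are illustrative, and an INAR(p) version would add one binomial thinning term per additional lag.

import numpy as np

rng = np.random.default_rng(0)

def simulate_zi_inar1(n, alpha, lam, pi0, x0=0):
    # INAR(1): X_t = alpha o X_{t-1} + eps_t, where "o" is binomial
    # thinning and eps_t is zero-inflated Poisson(lam) with
    # extra-zero probability pi0.
    x = np.empty(n, dtype=int)
    prev = x0
    for t in range(n):
        survivors = rng.binomial(prev, alpha)              # binomial thinning
        innovation = 0 if rng.random() < pi0 else rng.poisson(lam)
        x[t] = survivors + innovation
        prev = x[t]
    return x

series = simulate_zi_inar1(500, alpha=0.4, lam=2.0, pi0=0.3)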
This article describes a Bayesian framework for estimation in item response models, with two-stage prior distributions on both item and examinee populations. Strategies for point and interval estimation are discussed, and a general procedure based on the EM algorithm is presented. Details are given for implementation under one-, two-, and three-parameter logistic IRT models. Novel features include minimally restrictive assumptions about examinee distributions and the exploitation of dependence among item parameters in a population of interest. Improved estimation in a moderately small sample is demonstrated with simulated data.
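As a generic illustration of the machinery such an EM procedure rests on (a sketch under standard assumptions, not the article's actual algorithm): under the two-parameter logistic model, the marginal likelihood of a response pattern integrates the item response function against the examinee distribution, usually via quadrature.

import numpy as np

def p_2pl(theta, a, b):
    # Two-parameter logistic item response function P(correct | theta).
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def marginal_loglik(responses, a, b, nodes, weights):
    # Log-likelihood of one response pattern with ability theta
    # integrated out over a quadrature grid (nodes, weights).
    p = p_2pl(nodes[:, None], a, b)                    # (grid, items)
    lik = np.prod(np.where(responses, p, 1 - p), axis=1)
    return np.log(np.dot(weights, lik))

# Standard-normal prior approximated on an evenly spaced grid (an
# illustrative choice; Gauss-Hermite nodes are the usual refinement).
nodes = np.linspace(-4, 4, 41)
weights = np.exp(-0.5 * nodes**2)
weights /= weights.sum()
ll = marginal_loglik(np.array([1, 0, 1]),
                     a=np.array([1.2, 0.8, 1.5]),
                     b=np.array([-0.5, 0.0, 1.0]),
                     nodes=nodes, weights=weights)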
Proximal distance algorithms combine the classical penalty method of constrained minimization with distance majorization. If f(x) is the loss function and C is the constraint set in a constrained minimization problem, then the proximal distance principle mandates minimizing the penalized loss f(x) + (ρ/2) dist(x, C)² and following the solution x_ρ to its limit as ρ tends to ∞. At each iteration the squared Euclidean distance dist(x, C)² is majorized by the spherical quadratic ‖x − P_C(x_k)‖², where P_C(x_k) denotes the projection of the current iterate x_k onto C. The minimum of the surrogate function f(x) + (ρ/2)‖x − P_C(x_k)‖² is given by the proximal map prox_{ρ^{-1}f}[P_C(x_k)]. The next iterate x_{k+1} automatically decreases the original penalized loss for fixed ρ. Since many explicit projections and proximal maps are known, it is straightforward to derive and implement novel optimization algorithms in this setting. These algorithms can take hundreds if not thousands of iterations to converge, but the simple nature of each iteration makes proximal distance algorithms competitive with traditional algorithms. For convex problems, proximal distance algorithms reduce to proximal gradient algorithms and therefore enjoy well-understood convergence properties. For nonconvex problems, one can attack convergence by invoking Zangwill's theorem. Our numerical examples demonstrate the utility of proximal distance algorithms in various high-dimensional settings, including (a) linear programming, (b) constrained least squares, (c) projection to the closest kinship matrix, (d) projection onto a second-order cone constraint, (e) calculation of Horn's copositive matrix index, (f) linear complementarity programming, and (g) sparse principal components analysis. The proximal distance algorithm in each case is competitive with or superior in speed to traditional methods such as the interior point method and the alternating direction method of multipliers (ADMM). Source code for the numerical examples can be found
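The iteration just described is simple enough to spell out. Below is a minimal Python sketch for one of the listed settings, constrained least squares with C the nonnegative orthant: the projection is coordinatewise clipping, the proximal map of the least-squares loss is a linear solve, and the ρ schedule and tolerances are illustrative choices rather than the paper's.

import numpy as np

def prox_least_squares(v, A, b, rho):
    # prox_{f/rho}(v) for f(x) = 0.5*||Ax - b||^2:
    # argmin_x f(x) + (rho/2)*||x - v||^2 solves a linear system.
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + rho * np.eye(n), A.T @ b + rho * v)

def proximal_distance_nnls(A, b, rho=1.0, rho_max=1e6, iters=5000):
    # Nonnegative least squares via the proximal distance iteration
    # x_{k+1} = prox_{f/rho}(P_C(x_k)), with C the nonnegative orthant.
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        proj = np.maximum(x, 0.0)              # P_C(x_k): clip to C
        x_new = prox_least_squares(proj, A, b, rho)
        if np.linalg.norm(x_new - x) < 1e-10 and rho >= rho_max:
            x = x_new
            break
        x = x_new
        rho = min(rho * 1.05, rho_max)         # slowly inflate the penalty
    return np.maximum(x, 0.0)

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
x_hat = proximal_distance_nnls(A, b)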
Middle censoring, introduced by Jammalamadaka and Vasudevan (2003), refers to data arising in situations where the exact lifetime becomes unobservable if it falls within a random censoring interval, and is observable otherwise. In the present paper we propose a proportional hazards regression model for lifetime data subject to middle censoring, where the lifetimes are assumed to follow a Weibull distribution. The regression parameters are estimated using the EM algorithm, and asymptotic normality of these estimators is established. We report a simulation study assessing the finite-sample properties of the model and analyse real data on survival times in months for multiple myeloma patients studied by Krall et al. (1975).
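The likelihood structure behind the EM step is easy to state (a generic sketch without the paper's proportional hazards covariates): an exactly observed lifetime contributes its density, while a lifetime falling inside its censoring interval (l, r) contributes the probability F(r) − F(l).

import numpy as np

def weibull_logpdf(t, shape, scale):
    z = t / scale
    return np.log(shape / scale) + (shape - 1) * np.log(z) - z**shape

def weibull_cdf(t, shape, scale):
    return 1.0 - np.exp(-(t / scale) ** shape)

def middle_censored_loglik(t, left, right, censored, shape, scale):
    # Observed-data log-likelihood: exact lifetimes contribute log f(t);
    # middle-censored ones contribute log[F(right) - F(left)].
    exact = weibull_logpdf(t[~censored], shape, scale)
    interval = np.log(weibull_cdf(right[censored], shape, scale)
                      - weibull_cdf(left[censored], shape, scale))
    return exact.sum() + interval.sum()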
In this article we aim to calculate value at risk (VaR), one of the standard financial risk measures, by using a mixture of two different distributions when the financial data do not fit the normal distribution. The normal-logDagum distribution, a mixture of the normal and log-Dagum distributions, is proposed to calculate the VaR for non-normal financial data. The expectation-maximization (EM) algorithm for the maximum likelihood estimates of the parameters of the normal-logDagum distribution is derived. In the application, the stocks of bank and telecommunication companies were examined, and the VaR values obtained with different distributions are compared numerically. The comparison shows that modeling based on the normal-logDagum distribution is more successful for the statistical modeling of financial data.
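Once any two-component mixture has been fitted, the VaR at level α is simply the α-quantile of the fitted distribution, found by inverting its CDF. The sketch below is a stand-in that uses two normal components, since the log-Dagum density is not available in standard libraries; the weight and parameter values are placeholders for fitted estimates.

import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def mixture_cdf(x, w, params1, params2):
    # CDF of a two-component mixture; both components are normal here
    # as a stand-in for the normal/log-Dagum pair.
    return w * norm.cdf(x, *params1) + (1 - w) * norm.cdf(x, *params2)

def value_at_risk(alpha, w, params1, params2):
    # VaR_alpha as the alpha-quantile of the fitted mixture of returns.
    return brentq(lambda x: mixture_cdf(x, w, params1, params2) - alpha,
                  -10.0, 10.0)

# Placeholder fitted values: a calm regime and a heavy-loss regime.
var_5pct = value_at_risk(0.05, w=0.9, params1=(0.001, 0.01),
                         params2=(-0.02, 0.05))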
This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response theory, diagnostic classification models, multitrait–multimethod (MTMM) models, and discrete mixture distribution models. These types of models are frequently applied to the analysis of multidimensional responses of test takers to a set of items, for example, in the context of proficiency testing. The algorithm presented here is based on a direct implementation of massive parallelism using a paradigm that distributes work among a number of processor cores. Modern desktop computers, as well as many laptops, use processors that contain 2–4 cores and potentially twice that number of virtual cores. Many servers use 2, 4, or more multicore central processing units (CPUs), which brings the number of cores to 8, 12, 32, or even 64 or more. The algorithm presented here scales the time reduction in the most calculation-intensive part of the program almost linearly for some problems, which means that a server with 32 physical cores executes the parallel-E step algorithm up to 24 times faster than a single-core computer or the equivalent nonparallel algorithm. The overall gain (including parts of the program that cannot be executed in parallel) can reach a reduction in time by a factor of 6 or more on a 12-core machine. The basic approach is to utilize the architecture of modern CPUs, which often involves multiple cores that can run programs simultaneously. The use of this type of architecture for algorithms that produce posterior moments has straightforward appeal: the calculations conducted for each respondent or each distinct response pattern can be split up into simultaneous calculations.
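A hedged sketch of the parallelization pattern described, not the report's code: the E step decomposes over respondents (or distinct response patterns), so the per-respondent posterior computations can be farmed out to a process pool and their results summed into the sufficient statistics the M step needs.

import numpy as np
from multiprocessing import Pool

NODES = np.linspace(-4, 4, 21)               # quadrature grid for the latent trait
PRIOR = np.exp(-0.5 * NODES**2)
PRIOR /= PRIOR.sum()
# Toy item response probabilities on the grid (5 items, all difficulty 0).
ITEM_P = 1.0 / (1.0 + np.exp(-np.subtract.outer(NODES, np.zeros(5))))

def posterior_one(pattern):
    # Posterior weights over the grid for a single response pattern.
    lik = np.prod(np.where(pattern, ITEM_P, 1 - ITEM_P), axis=1)
    post = PRIOR * lik
    return post / post.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patterns = [rng.integers(0, 2, 5) for _ in range(1000)]
    with Pool() as pool:                      # parallel E step across respondents
        posteriors = pool.map(posterior_one, patterns)
    expected_counts = np.sum(posteriors, axis=0)   # grid counts for the M step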
Consider the problem of estimating a function T of the values of the variables z and y in a finite population, when values of y are known a priori for all units but values of z are not known for any. Data are then obtained from a sample of units, not in the form of values of z, however, but in the form of secondary variables x whose values depend stochastically on z. A framework for Bayesian analysis of T along the lines of Ericson (1969) is presented and illustrated with data from the Profile of American Youth (U.S. Department of Defense 1982).