In the classical growth curve setting, individuals are repeatedly measured over time on an outcome of interest. The objective of statistical modeling is to fit some function of time, generally a polynomial, that describes the outcome's behavior. The polynomial coefficients are assumed drawn from a multivariate normal mixing distribution. At times, it may be known that each individual's polynomial must follow a restricted form. When the polynomial coefficients lie close to the restriction boundary, or the outcome is subject to substantial measurement error, or relatively few observations per individual are recorded, it can be advantageous to incorporate known restrictions. This paper introduces a class of models where the polynomial coefficients are assumed drawn from a restricted multivariate normal whose support is confined to a theoretically permissible region. The model can handle a variety of restrictions on the space of random parameters. The restricted support ensures that each individual's random polynomial is theoretically plausible. Estimation, posterior calculations, and comparisons with the unrestricted approach are provided.
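A minimal sketch of the data-generating idea, not the paper's estimator: polynomial coefficients are drawn from a multivariate normal restricted to a permissible region (here via rejection sampling), and each individual's trajectory is the restricted polynomial plus measurement error. The means, covariance, and non-negative-slope restriction below are illustrative assumptions.

```python
# A minimal sketch (not the paper's estimator): drawing quadratic growth-curve
# coefficients from a multivariate normal restricted to a permissible region
# (here, hypothetically, a non-negative linear slope) via rejection sampling.
import numpy as np

rng = np.random.default_rng(0)

def sample_restricted_coefs(mean, cov, n, accept, max_tries=100_000):
    """Rejection-sample rows from N(mean, cov) satisfying accept(row)."""
    out = []
    for _ in range(max_tries):
        draw = rng.multivariate_normal(mean, cov)
        if accept(draw):
            out.append(draw)
            if len(out) == n:
                break
    return np.array(out)

# Hypothetical population values for intercept, linear, and quadratic terms.
mean = np.array([10.0, 1.5, -0.1])
cov = np.diag([4.0, 0.25, 0.01])

# Restriction: linear coefficient must be non-negative (illustrative only).
coefs = sample_restricted_coefs(mean, cov, n=50, accept=lambda b: b[1] >= 0)

# Each individual's observed trajectory: polynomial in time plus measurement error.
times = np.arange(0, 5)
X = np.vander(times, N=3, increasing=True)          # columns: 1, t, t^2
y = coefs @ X.T + rng.normal(scale=1.0, size=(len(coefs), len(times)))
print(y.shape)  # (n_individuals, n_time_points)
```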
We examined the impact of different methods for replacing missing data in discriminant analyses conducted on randomly generated samples from multivariate normal and non-normal distributions. The probabilities of correct classification were obtained for these discriminant analyses before and after randomly deleting data as well as after deleted data were replaced using: (1) variable means, (2) principal component projections, and (3) the EM algorithm. Populations compared were: (1) multivariate normal with covariance matrices Σ1 = Σ2, (2) multivariate normal with Σ1 ≠ Σ2, and (3) multivariate non-normal with Σ1 = Σ2. Differences in the probabilities of correct classification were most evident for populations with small Mahalanobis distances or high proportions of missing data. The three replacement methods performed similarly, but all were better than non-replacement.
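A minimal sketch of the comparison, not the study's simulation design: entries are deleted at random, replaced by variable means, and the resubstitution accuracy of linear discriminant analysis is compared with the accuracy on the complete data. Sample sizes, deletion rate, and population parameters are illustrative.

```python
# Sketch only: mean replacement of missing values before discriminant analysis,
# compared against the complete data (resubstitution accuracy as a rough stand-in
# for the probability of correct classification).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.impute import SimpleImputer

rng = np.random.default_rng(1)

# Two multivariate normal populations with a common covariance (illustrative values).
n, p = 200, 4
X0 = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
X1 = rng.multivariate_normal(np.full(p, 1.0), np.eye(p), size=n)
X = np.vstack([X0, X1])
y = np.repeat([0, 1], n)

# Accuracy on the complete data.
full_acc = LinearDiscriminantAnalysis().fit(X, y).score(X, y)

# Randomly delete 20% of the entries, then replace them by column means.
X_miss = X.copy()
mask = rng.random(X.shape) < 0.20
X_miss[mask] = np.nan
X_mean = SimpleImputer(strategy="mean").fit_transform(X_miss)
mean_acc = LinearDiscriminantAnalysis().fit(X_mean, y).score(X_mean, y)

print(f"complete data: {full_acc:.3f}, mean replacement: {mean_acc:.3f}")
```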
Luce's Biased Choice Model has never had a serious competitor as a model of identification data. Even when it has provided a poor model of such data, other models have done even less well. Two alternative models are presented, and the three are fit to a published data set. One alternative model is very much like the Biased Choice Model, differing only in the way it treats response bias. It uses an ordinal assumption about the biases and might be called the Triangular Bias (TB) model. The Guessing Mixture Model (GMM) is quite different, although it too uses the concepts of bias and similarity. It posits that the observed confusion matrix is a probability mixture of two latent matrices, one involving only similarity, not bias, and the other only bias, not similarity. The illustrative data, a confusion matrix based on four stimuli constructed by crossing two binary features, can be described naturally in three hierarchical ways. The most general description ignores the feature structure of the stimuli. The next description, the feature pattern model, assumes that similarity depends only on the pattern of feature differences, and the simplest special case assumes that similarity depends only on the product of similarities from each of the features. For the general description the three models are not strikingly different, with the Biased Choice Model fitting least well, followed by the GMM, with the TB model the winner. For the independent feature form, however, the GMM fits much better than either of the others. Indeed, the independent feature model cannot be rejected at the 10% level using the GMM, even though the data sample is large.
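For concreteness, the Biased Choice Model predicts P(respond j | stimulus i) = b_j η_ij / Σ_k b_k η_ik, with η a symmetric similarity matrix and b a vector of response biases. The sketch below builds the predicted confusion matrix for four stimuli formed by crossing two binary features under the independent-feature assumption; the similarity and bias values are illustrative, not fitted to any data set.

```python
# Sketch of the Biased Choice Model's prediction: the probability of responding
# j to stimulus i is b_j * eta_ij / sum_k b_k * eta_ik.
# Parameter values are illustrative only.
import numpy as np

def biased_choice_matrix(eta, b):
    """Predicted confusion matrix under Luce's Biased Choice Model."""
    weighted = eta * b                         # multiply column j by bias b_j
    return weighted / weighted.sum(axis=1, keepdims=True)

# Four stimuli formed by crossing two binary features; under the independent-
# feature assumption, similarity is the product of per-feature similarities.
s1, s2 = 0.4, 0.6                              # hypothetical per-feature similarities
features = [(0, 0), (0, 1), (1, 0), (1, 1)]
eta = np.array([[(s1 if f[0] != g[0] else 1.0) * (s2 if f[1] != g[1] else 1.0)
                 for g in features] for f in features])
b = np.array([0.30, 0.25, 0.25, 0.20])         # hypothetical response biases

print(np.round(biased_choice_matrix(eta, b), 3))
```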
A plausible s-factor solution for many types of psychological and educational tests is one that exhibits a general factor and s - 1 group- or method-related factors. The bi-factor solution results from the constraint that each item has a nonzero loading on the primary dimension and at most one of the s - 1 group factors. This paper derives a bi-factor item-response model for binary response data. In marginal maximum likelihood estimation of item parameters, the bi-factor restriction leads to a major simplification of the likelihood equations and (a) permits analysis of models with large numbers of group factors; (b) permits conditional dependence within identified subsets of items; and (c) provides more parsimonious factor solutions than an unrestricted full-information item factor analysis in some cases.
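A minimal sketch of the bi-factor structure for binary items, assuming a logistic (2PL-style) item response function: each item has a loading on the general factor and on at most one group factor. The loadings, intercepts, and group assignments below are illustrative, not estimates from any data set.

```python
# Bi-factor item response sketch: P(correct) = logistic(a_g*theta_g + a_s*theta_s + c),
# where theta_s is the score on the single group factor the item loads on.
# All parameter values are illustrative.
import numpy as np

def bifactor_prob(theta_g, theta_groups, a_g, a_s, group, c):
    """P(correct) for each item given general and group-factor abilities."""
    z = a_g * theta_g + a_s * theta_groups[group] + c
    return 1.0 / (1.0 + np.exp(-z))

a_g   = np.array([1.2, 0.9, 1.0, 1.1])      # loadings on the general factor
a_s   = np.array([0.6, 0.5, 0.7, 0.4])      # loadings on the item's group factor
group = np.array([0, 0, 1, 1])              # each item loads on at most one group factor
c     = np.array([-0.2, 0.1, 0.0, 0.3])     # item intercepts

theta_g = 0.5                                # a person's general ability
theta_groups = np.array([0.3, -0.4])         # the person's two group-factor scores

print(np.round(bifactor_prob(theta_g, theta_groups, a_g, a_s, group, c), 3))
```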
A principal curve (Hastie and Stuetzle, 1989) is a smooth curve passing through the 'middle' of a distribution or data cloud, and is a generalization of linear principal components. We give an alternative defi...
A sample is commonly modeled by a mixture distribution if the observations follow a common distribution, but the parameter of interest differs between observations. For example, we observe the lengths but not the ages of a sample of fish. It may be reasonable to assume that length is normally distributed about an unknown mean that depends on the age of the fish. Provided there is more than one age class in the sample, the data are distributed as a mixture of normals. In this article we assume that the data are a random sample from a mixture of exponential family distributions and that for each observation the parameter of interest is sampled independently from an unknown mixing distribution Q. The adequacy of a fitted mixture model can be assessed by examining residuals based on the ratio of the observed to expected fit. Residuals based on the homogeneity model (in which Q is a one-point distribution) display a convexity property when the data follow a mixture model; this becomes the basis for diagnostic plots to detect the presence of mixing. Similar results also are obtained from smoothed residuals; thus the diagnostic also can be applied to sparse or continuous data. The nonparametric maximum likelihood estimate Q̂ of the distribution Q is known to be discrete. Smoothed residuals obtained from the fitted mixture model provide information about the number of support points in Q̂. This facilitates the use of the EM algorithm to find Q̂. The residuals evaluated at Q̂ determine whether or not the maximum likelihood estimate is unique and hence interpretable. Simulated and actual data sets are analyzed to illustrate the power and the utility of these procedures.
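A minimal sketch of one piece of this machinery, not the article's full procedure: the EM iteration for the mixing weights of a normal mixture when candidate support points for Q are fixed in advance (the residual diagnostics above indicate roughly how many are needed). The simulated "lengths", the grid, and the unit scale are illustrative assumptions.

```python
# EM for the mixing weights of a normal mixture over a fixed grid of candidate
# support points for Q; data and grid are illustrative.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Simulated "lengths": a two-component normal mixture with unit variance.
x = np.concatenate([rng.normal(10, 1, 300), rng.normal(14, 1, 200)])

support = np.linspace(8, 16, 9)                  # candidate support points for Q
w = np.full(len(support), 1 / len(support))      # initial mixing weights

# Component densities f(x_i | mu_j) for every observation and support point.
dens = norm.pdf(x[:, None], loc=support[None, :], scale=1.0)

for _ in range(500):
    # E-step: posterior probability that observation i came from support point j.
    post = dens * w
    post /= post.sum(axis=1, keepdims=True)
    # M-step: update the mixing weights.
    w = post.mean(axis=0)

# The NPMLE is discrete: most weights collapse toward zero, leaving a few support points.
for mu, wt in zip(support, w):
    if wt > 0.01:
        print(f"support {mu:4.1f}  weight {wt:.3f}")
```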
Approximate maximum-likelihood estimates for locating wideband sources in the presence of partly unknown noise fields are developed. Alternatively, two least squares methods fitting a parametric model of the spectral density matrix to a corresponding nonparametric estimate are investigated. Furthermore, applying the expectation-maximization (EM) algorithm, a computationally robust iteration scheme for maximizing the log-likelihood function is derived. Using wideband data from a North Sea experiment, we compare the performance of the maximum likelihood method via the EM algorithm with the MUSIC algorithm combined with the rotational signal-subspace (RSS) focusing technique.
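The paper works with wideband data, RSS focusing, and an EM-based maximum likelihood estimator; the sketch below shows only the much simpler narrowband MUSIC pseudo-spectrum on a uniform linear array, to illustrate the signal-subspace idea MUSIC rests on. The array geometry, source bearings, and noise level are illustrative.

```python
# Narrowband MUSIC sketch on a uniform linear array (illustrative parameters only).
import numpy as np

rng = np.random.default_rng(3)
M, d, n_snap = 8, 0.5, 400                   # sensors, spacing (wavelengths), snapshots
true_deg = np.array([-20.0, 25.0])           # hypothetical source bearings

def steering(deg):
    """Steering vectors of the uniform linear array, one column per angle."""
    rad = np.deg2rad(np.atleast_1d(deg))
    return np.exp(-2j * np.pi * d * np.arange(M)[:, None] * np.sin(rad)[None, :])

# Simulated snapshots: two uncorrelated unit-power sources plus white noise.
A = steering(true_deg)
S = (rng.normal(size=(2, n_snap)) + 1j * rng.normal(size=(2, n_snap))) / np.sqrt(2)
N = 0.3 * (rng.normal(size=(M, n_snap)) + 1j * rng.normal(size=(M, n_snap))) / np.sqrt(2)
X = A @ S + N

# Sample covariance, noise subspace, and MUSIC pseudo-spectrum over a bearing grid.
R = X @ X.conj().T / n_snap
eigvals, eigvecs = np.linalg.eigh(R)         # eigenvalues in ascending order
En = eigvecs[:, : M - 2]                     # noise subspace
grid = np.linspace(-90, 90, 721)
proj = En.conj().T @ steering(grid)
p_music = 1.0 / np.sum(np.abs(proj) ** 2, axis=0)

# Take the two largest local maxima of the pseudo-spectrum as bearing estimates.
is_peak = (p_music[1:-1] > p_music[:-2]) & (p_music[1:-1] > p_music[2:])
cand = np.flatnonzero(is_peak) + 1
top2 = cand[np.argsort(p_music[cand])[-2:]]
print("estimated bearings (deg):", np.sort(grid[top2]))
```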
For latent class analysis, a widely known statistical method for the unmixing of an observed frequency table into several unobservable ones, a flexible model is presented for constraining the unknown class sizes (mixing weights) and the unknown latent response probabilities. Two systems of basic equations are stated such that they simultaneously allow parameter fixations, the equality of certain parameters, as well as linear logistic constraints on each of the original parameters. The maximum likelihood equations for the parameters of this 'linear logistic latent class analysis' are given, and their estimation by means of the EM algorithm is described. Further, the criteria for their local identifiability and statistical tests (Pearson and likelihood-ratio χ²) for goodness of fit are outlined. The practical applicability of linear logistic latent class analysis is demonstrated by three examples: mixed logistic regression, a mixed Bradley–Terry model for paired comparisons with ties, and a local dependence latent class model in which the departure from stochastic independence is covered by a single additional parameter per class.
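A minimal EM sketch for an unrestricted latent class model with binary items; the linear logistic constraints described in the paper would be imposed in the M-step, which is left unconstrained here. The simulated responses and the two-class setup are illustrative.

```python
# EM for an *unrestricted* latent class model with binary items (the constrained
# M-step of linear logistic LCA is not implemented here). Data are simulated.
import numpy as np

rng = np.random.default_rng(4)

# Simulate responses from two latent classes with different response probabilities.
true_pi = np.array([0.6, 0.4])
true_p = np.array([[0.8, 0.7, 0.9, 0.6],
                   [0.2, 0.3, 0.1, 0.4]])
z = rng.choice(2, size=500, p=true_pi)
Y = (rng.random((500, 4)) < true_p[z]).astype(float)

# EM for class sizes pi and latent response probabilities p.
K, J = 2, Y.shape[1]
pi = np.full(K, 1 / K)
p = rng.uniform(0.25, 0.75, size=(K, J))

for _ in range(200):
    # E-step: posterior class membership for each response pattern.
    like = np.prod(p[None] ** Y[:, None] * (1 - p[None]) ** (1 - Y[:, None]), axis=2)
    post = like * pi
    post /= post.sum(axis=1, keepdims=True)
    # M-step: update mixing weights and response probabilities.
    pi = post.mean(axis=0)
    p = (post.T @ Y) / post.sum(axis=0)[:, None]

print(np.round(pi, 2))
print(np.round(p, 2))
```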