The expectation-maximization (EM) algorithm is a robust method for maximum likelihood estimation of the parameters of an incompletely sampled distribution. It has been used to resolve the trial-to-trial amplitude fluctuations of postsynaptic potentials when these are recorded in the presence of noise. Its use has, however, been limited by the need for different recursion equations for each set of conditions defined by the signal and noise processes. These equations are derived for the following conditions, which arise in studies of synaptic transmission: non-Gaussian noise processes; quantal fluctuation; quantal variability. In addition, constraints can be incorporated to accommodate simple and compound binomial models of transmitter release. Some advantages of these methods are illustrated by Monte Carlo simulations.
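The abstract does not reproduce the recursion equations, but the flavor of the approach can be sketched for the simplest case: a simple binomial release model with known quantal size and Gaussian noise, which reduces to EM for a finite mixture of normals. Everything below (the function name em_quantal, the fixed q and sigma, the simulated amplitudes) is illustrative, not the paper's derivation.

```python
import numpy as np
from scipy.stats import binom, norm

def em_quantal(amps, n_sites, q, sigma, p=0.5, n_iter=200):
    """EM for the release probability p in a simple binomial quantal model.

    Each trial's amplitude is k*q + Gaussian noise, where
    k ~ Binomial(n_sites, p). Quantal size q and noise SD sigma
    are treated as known here for brevity.
    """
    ks = np.arange(n_sites + 1)
    for _ in range(n_iter):
        # E-step: posterior P(k | amplitude) for each trial.
        log_lik = norm.logpdf(amps[:, None], loc=ks * q, scale=sigma)
        log_post = log_lik + binom.logpmf(ks, n_sites, p)
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: p is the mean expected fraction of sites releasing.
        p = (post @ ks).mean() / n_sites
    return p

rng = np.random.default_rng(0)
k_true = rng.binomial(5, 0.3, size=500)
amps = k_true * 1.0 + rng.normal(0, 0.3, size=500)
print(em_quantal(amps, n_sites=5, q=1.0, sigma=0.3))  # recovers ~0.3
```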
This article considers Bayesian estimation methods for categorical data with misclassifications. To adjust for misclassification, double sampling schemes are utilized. Observations are represented in a contingency table categorized by error-free categorical variables and error-prone categorical variables. Posterior means of the cell probabilities are taken as estimates. In some cases, the posterior means can be calculated exactly. In other cases, the exact calculation may be too difficult to perform, but we can easily use the expectation-maximization (EM) algorithm to obtain approximate posterior means.
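As a rough illustration of how EM can stand in for an exact posterior calculation, the sketch below computes posterior modes (which approximate posterior means for moderately large counts) in a 2x2 double-sampling design with one error-free and one error-prone classifier. The model, the symmetric Dirichlet pseudocount alpha, and the counts are assumptions for the example, not taken from the article.

```python
import numpy as np

def em_double_sample(n_val, m, alpha=1.0, n_iter=200):
    """MAP estimation (an approximation to the posterior mean) for a
    2x2 misclassification model under double sampling.

    n_val[i, j]: validation counts, true class i and fallible class j.
    m[j]:        main-sample counts with only the fallible class observed.
    alpha:       symmetric Dirichlet pseudocount (assumed prior).
    """
    pi = np.array([0.5, 0.5])        # P(true = i)
    theta = np.full((2, 2), 0.5)     # P(fallible = j | true = i)
    for _ in range(n_iter):
        # E-step: split each main-sample count m[j] over true classes.
        joint = pi[:, None] * theta  # (i, j) -> P(true=i, fallible=j)
        resp = joint / joint.sum(axis=0, keepdims=True)
        filled = n_val + resp * m    # expected complete-data counts
        # M-step with Dirichlet pseudocounts (posterior mode).
        row = filled.sum(axis=1)
        pi = (row + alpha - 1) / (row.sum() + 2 * (alpha - 1))
        theta = (filled + alpha - 1) / (row + 2 * (alpha - 1))[:, None]
    return pi, theta

n_val = np.array([[40.0, 10.0], [5.0, 45.0]])  # hypothetical counts
m = np.array([300.0, 200.0])
pi_hat, theta_hat = em_double_sample(n_val, m, alpha=2.0)
print(pi_hat)
```

With alpha = 1.0 the pseudocount terms vanish and the iteration reduces to plain maximum likelihood EM.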
Bayesian methods are suggested for estimating proportions in the cells of cross-classification tables having at least one classification with ordered categories. These methods utilize models for cell proportions that incorporate the category orderings. The resulting estimators are smoother and can be much more efficient than the sample proportions, yet they are consistent even if the model chosen for the smoothing does not hold. Two approaches are considered: (1) Bayes estimators using a Dirichlet prior distribution for the proportions; (2) Bayes estimators based on normal prior distributions for association parameters in the saturated loglinear model. In each case, the means of the prior distributions are chosen to satisfy a model for ordered categorical data, such as the uniform association model. Empirical Bayes versions of the two analyses are also given.
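A minimal sketch of approach (1): under a Dirichlet prior with mean vector gamma (fitted, for instance, from a uniform association model) and total concentration K, the posterior mean is a weighted average of the sample proportions and the model-based prior means. The function name, the choice of K, and the numbers are illustrative assumptions.

```python
import numpy as np

def dirichlet_smoothed(counts, prior_means, K=10.0):
    """Posterior-mean estimate of cell proportions under a
    Dirichlet(K * prior_means) prior.

    prior_means would come from a fitted ordered-category model
    (e.g., uniform association); here it is passed in directly.
    K controls how strongly estimates shrink toward the model.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    # Posterior mean: a weighted average of sample proportions
    # and the model-based prior means.
    return (counts + K * np.asarray(prior_means)) / (n + K)

counts = [12, 18, 25, 30, 15]                  # hypothetical cell counts
prior = np.array([0.1, 0.2, 0.3, 0.25, 0.15])  # hypothetical model fit
print(dirichlet_smoothed(counts, prior, K=20.0))
```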
The reporting procedures for potentially toxic pollutants are complicated by the fact that concentrations are measured using small samples that include a number of observations lying below some detection limit. Furthermore, there is often a small number of high concentrations observed in combination with a substantial number of low concentrations. This results in small, nonnormally distributed censored samples. This article presents maximum likelihood estimators for the mean of a population, based on censored samples that can be transformed to normality. The method estimates the optimal power transformation in the Box-Cox family by searching the censored-data likelihood. Maximum likelihood estimators for the mean in the transformed scale are calculated via the expectation-maximization algorithm. Estimates for the mean in the original scale are functions of the estimated mean and variance in the transformed population. Confidence intervals are computed using the delta method and the nonparametric percentile and bias-corrected percentile versions of Efron's bootstrap. A simulation study over sampling configurations expected with environmental data indicates that the delta method, combined with a reliable value for the power transformation, produces intervals with better coverage properties than the bootstrap intervals.
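A sketch of the EM step on the transformed scale, assuming left-censoring at a single detection limit d and normality after the Box-Cox transform (the transform search, the back-transformation to the original scale, and the interval construction are omitted). Function and variable names are mine, not the article's.

```python
import numpy as np
from scipy.stats import norm

def em_censored_normal(x_obs, d, n_cens, n_iter=500):
    """EM for the mean and SD of a normal population when n_cens
    observations lie below a detection limit d.

    x_obs: fully observed values (above the limit), already on the
    (Box-Cox transformed) normal scale.
    """
    mu, sigma = x_obs.mean(), x_obs.std()
    n = len(x_obs) + n_cens
    for _ in range(n_iter):
        # E-step: moments of a normal truncated above at d.
        a = (d - mu) / sigma
        lam = norm.pdf(a) / norm.cdf(a)            # inverse Mills ratio
        e1 = mu - sigma * lam                      # E[X | X < d]
        var_trunc = sigma**2 * (1 - a * lam - lam**2)
        e2 = var_trunc + e1**2                     # E[X^2 | X < d]
        # M-step: normal MLE from expected sufficient statistics.
        s1 = x_obs.sum() + n_cens * e1
        s2 = (x_obs**2).sum() + n_cens * e2
        mu = s1 / n
        sigma = np.sqrt(max(s2 / n - mu**2, 1e-12))
    return mu, sigma

rng = np.random.default_rng(1)
x = rng.normal(1.0, 1.0, 200)
d = 0.0
print(em_censored_normal(x[x >= d], d, n_cens=(x < d).sum()))
```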
An old problem in personnel psychology is to characterize distributions of test validity correlation coefficients. The proposed model views histograms of correlation coefficients as observations from a mixture distribution which, for a fixed sample size $n$, is a conditional mixture distribution $h(r \mid n) = \sum_j \lambda_j h(r; \rho_j, n)$, where $R$ is the correlation coefficient, the $\rho_j$ are population correlation coefficients, and the $\lambda_j$ are the mixing weights. The associated marginal distribution of $R$ is regarded as the parent distribution underlying histograms of empirical correlation coefficients. Maximum likelihood estimates of the parameters $\rho_j$ and $\lambda_j$ can be obtained with an EM algorithm, and tests for the number of components $t$ are achieved after the (one-component) density of $R$ is replaced with a tractable modeling density $h(r; \rho_j, n)$. Two illustrative examples are provided.
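A sketch of the EM fit using the Fisher-z normal approximation as the tractable modeling density (the article's exact choice of $h$ may differ); the component correlations, weights, and simulated data are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def em_corr_mixture(r, n, rho, n_iter=300):
    """EM for the mixing weights and component correlations of a
    mixture of correlation-coefficient distributions.

    The exact density h(r; rho_j, n) is replaced by the tractable
    Fisher-z approximation: atanh(R) ~ N(atanh(rho_j), 1/(n-3)).
    r: observed correlations; n: common sample size; rho: initial
    component correlations.
    """
    z = np.arctanh(r)
    zeta = np.arctanh(np.asarray(rho, dtype=float))
    lam = np.full(len(zeta), 1.0 / len(zeta))
    sd = 1.0 / np.sqrt(n - 3)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each r.
        dens = lam * norm.pdf(z[:, None], loc=zeta, scale=sd)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights and component means on the z scale.
        lam = resp.mean(axis=0)
        zeta = (resp * z[:, None]).sum(axis=0) / resp.sum(axis=0)
    return lam, np.tanh(zeta)

rng = np.random.default_rng(2)
z = np.where(rng.random(400) < 0.6,
             rng.normal(np.arctanh(0.2), 1 / np.sqrt(47), 400),
             rng.normal(np.arctanh(0.6), 1 / np.sqrt(47), 400))
lam, rho = em_corr_mixture(np.tanh(z), n=50, rho=[0.1, 0.5])
print(lam, rho)
```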
For multiple populations, a longitudinal factor analytic model which is entirely exploratory, that is, with no explicit identification constraints, is proposed. Factorial collapse and period/practice effects are allowed. An invariant and/or stationary factor pattern is permitted. The model is formulated stochastically. To implement it, a stagewise EM algorithm is developed. Finally, a numerical illustration utilizing Nesselroade and Baltes' data is presented.
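The stagewise algorithm for the full multi-population longitudinal model is beyond a short sketch, but the core EM iteration such schemes build on, Rubin-and-Thayer-style EM for a single-group exploratory factor model, looks like the following; all names and the simulated usage are illustrative, not the paper's procedure.

```python
import numpy as np

def em_factor(S, k, n_iter=500, seed=0):
    """EM for an exploratory factor model x = L z + e with z ~ N(0, I)
    and diagonal noise covariance Psi (a one-group, one-occasion
    simplification of the longitudinal, multi-population model).

    S: sample covariance matrix; k: number of factors.
    """
    p = S.shape[0]
    rng = np.random.default_rng(seed)
    L = rng.normal(size=(p, k)) * 0.1
    psi = np.diag(S).copy()
    for _ in range(n_iter):
        # E-step: regression of factors on observations.
        beta = L.T @ np.linalg.inv(L @ L.T + np.diag(psi))
        Cxz = S @ beta.T                                 # E[x z^T]
        Czz = np.eye(k) - beta @ L + beta @ S @ beta.T   # E[z z^T]
        # M-step: update loadings and unique variances.
        L = Cxz @ np.linalg.inv(Czz)
        psi = np.diag(S - L @ Cxz.T)
    return L, psi

rng = np.random.default_rng(5)
Z = rng.normal(size=(1000, 2))
L_true = np.array([[0.9, 0.0], [0.8, 0.1], [0.1, 0.7], [0.0, 0.8]])
X = Z @ L_true.T + rng.normal(0, 0.5, size=(1000, 4))
L_hat, psi_hat = em_factor(np.cov(X, rowvar=False), k=2)
print(np.round(psi_hat, 2))  # unique variances near 0.25
```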
Generalized linear modeling techniques and resistant fitting methods were used to analyze the results of two experiments to test for density-dependent viability selection on chromosomal variants in Drosophila pseudoobscura fruit flies. In the framework of the generalized linear model, a series of nested hypotheses was fitted by the maximum likelihood method. The hypotheses describe how the viabilities of two genotypes, relative to a third, varied with density and other factors. An additional analysis of the absolute viabilities of the three genotypes was also performed by the maximum likelihood method; estimates were computed by the EM algorithm. Resistant methods of fitting were then applied to the best-fitting relative viability models. The analyses confirm that the relative viability of one genotype increased linearly with density (with slope and intercept). Viability differences between genotypes can thus be explained by variation in density.
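As a hedged sketch of the kind of nested-hypothesis fit described (not the authors' actual model or data): maximize a multinomial likelihood in which one genotype's viability relative to the third is linear in density and another's is constant. The counts and starting values below are made up for illustration, and equal starting genotype frequencies are assumed.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, counts, density):
    """Negative multinomial log-likelihood for three genotypes whose
    viabilities, relative to the third, may depend on density:
    w1 = a1 + b1 * d, w2 = a2, w3 = 1 (equal initial frequencies)."""
    a1, b1, a2 = params
    ll = 0.0
    for n, d in zip(counts, density):
        w = np.array([a1 + b1 * d, a2, 1.0])
        if np.any(w <= 0):
            return np.inf  # viabilities must stay positive
        p = w / w.sum()
        ll += n @ np.log(p)
    return -ll

counts = np.array([[30, 40, 30], [45, 35, 20], [55, 30, 15.0]])  # hypothetical
density = np.array([50, 200, 400.0])
fit = minimize(neg_loglik, x0=[1.0, 0.0, 1.0],
               args=(counts, density), method="Nelder-Mead")
print(fit.x)  # estimated a1, b1, a2
```

Nested hypotheses (for example, b1 = 0) could then be compared with likelihood ratio tests.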
A test census of Tampa, Florida and an independent postenumeration survey (PES) were conducted by the U.S. Census Bureau in 1985. The PES was a stratified block sample with heavy emphasis placed on hard-to-count population groups. Matching the individuals in the census to the individuals in the PES is an important aspect of census coverage evaluation and consequently a very important process for any census adjustment operations that might be planned. For such an adjustment to be feasible, record-linkage software had to be developed that could perform matches with a high degree of accuracy and that was based on an underlying mathematical theory. A principal purpose of the PES was to provide an opportunity to evaluate the newly implemented record-linkage system and associated methodology. This article discusses the theoretical and practical issues encountered in conducting the matching operation and presents the results of that operation. A review of the theoretical background of the record-linkage problem provides a framework for discussions of the decision procedure, file blocking, and the independence assumption. The estimation of the parameters required by the decision procedure is an important aspect of the methodology, and the techniques presented provide a practical system that is easily implemented. The matching algorithm (discussed in detail) uses the linear sum assignment model to 'pair' the records. The Tampa, Florida, matching methodology is described in the final sections of the article. Included in the discussion are the results of the matching itself, an independent clerical review of the matches and nonmatches, conclusions, problem areas, and future work required.
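The assignment step can be illustrated with SciPy's linear_sum_assignment, which solves the linear sum assignment model the article mentions. The weight matrix below is hypothetical, and treating a positive optimal weight as a match is a simplification of the actual decision procedure.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical Fellegi-Sunter-style agreement weights w[i, j]
# between census record i and PES record j; higher values mean
# stronger evidence that the pair is a true match.
w = np.array([[ 8.2, -3.1, -4.0],
              [-2.5,  7.7, -3.3],
              [-4.1, -2.9,  6.9],
              [-3.8, -3.5, -2.2]])

# The linear sum assignment model pairs records one-to-one so that
# the total weight is maximized (SciPy minimizes, hence the sign flip).
rows, cols = linear_sum_assignment(-w)
for i, j in zip(rows, cols):
    status = "match" if w[i, j] > 0 else "nonmatch"
    print(f"census {i} <-> PES {j}: weight {w[i, j]:+.1f} ({status})")
```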
Longitudinal studies often involve the repeated diagnosis across time of each patient's status with respect to a progressive categorical process. When the occurrence of a change in status is not readily apparent, two factors can make modeling and assessing the incidence rates of progression difficult. First, because diagnoses may be difficult, they may not be performed with the frequency necessary to pinpoint exact times of incidence. Second, uncertainty in the diagnostic process can obscure identification of the time interval in which incidence occurs. When serial diagnoses are fallible, even small error rates can seriously disrupt interpretation and make standard methods difficult or impossible to use. For example, if false diagnoses (both false positives and false negatives) occur independently with probability .05 in a longitudinal study involving four serial diagnoses, 19% of the strings of serial diagnoses would be expected to contain at least one error. If the underlying process is progressive, many of these errors would be noticeable: At face value, some patterns of diagnoses would describe regressions. Errors yielding patterns of diagnoses that are progressive would not be detectable. Simply omitting any subjects with inconsistent patterns from the analysis introduces bias. Another possible approach, using the first reported incidence of progression, also introduces bias (Schlesselman 1977). To analyze clinical data on the diagnosis of sexual maturation among subjects with sickle-cell disease, models are developed for jointly parameterizing incidence and error rates. An EM algorithm is presented that allows tractable maximum likelihood estimation even when the times of diagnoses are irregular and vary among subjects. Likelihood ratio tests are used to assess relationships between categorical covariates and both incidence and error rates. Data from the Cooperative Study of Sickle Cell Disease are analyzed to describe the age distribution for the onset of sexual maturation.
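A simplified sketch of a joint incidence/error EM for equally spaced visits (the article handles irregular, subject-varying diagnosis times and covariates, which this omits); all names and the simulation are illustrative.

```python
import numpy as np

def em_progressive(Y, n_iter=200):
    """EM for a progressive 0->1 process observed through fallible
    serial diagnoses (a simplified, equally-spaced-visit sketch).

    Y: (subjects x visits) 0/1 diagnoses. Latent onset interval s
    means the true status is 1 from visit s on (s = T means no
    onset during follow-up). Each diagnosis independently errs
    with probability eps.
    """
    N, T = Y.shape
    # truth[s, t] = true status at visit t given onset interval s.
    truth = (np.arange(T)[None, :] >= np.arange(T + 1)[:, None]).astype(float)
    pi = np.full(T + 1, 1.0 / (T + 1))  # onset-interval probabilities
    eps = 0.05
    for _ in range(n_iter):
        # E-step: posterior over onset interval for each subject.
        err = (Y[:, None, :] != truth[None, :, :]).sum(axis=2)
        lik = eps**err * (1 - eps)**(T - err)
        post = pi * lik
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update onset distribution and error rate.
        pi = post.mean(axis=0)
        eps = (post * err).sum() / (N * T)
    return pi, eps

rng = np.random.default_rng(3)
onset = rng.integers(0, 5, size=300)  # true onset visit (4 = never)
true_states = (np.arange(4)[None, :] >= onset[:, None]).astype(int)
Y = np.where(rng.random(true_states.shape) < 0.05, 1 - true_states, true_states)
print(em_progressive(Y))  # recovers the onset distribution and eps ~ 0.05
```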
The t distribution provides a useful extension of the normal for statistical modeling of data sets involving errors with longer-than-normal tails. An analytical strategy based on maximum likelihood for a general model with multivariate t errors is suggested and applied to a variety of problems, including linear and nonlinear regression, robust estimation of the mean and covariance matrix with missing data, unbalanced multivariate repeated-measures data, multivariate modeling of pedigree data, and multivariate nonlinear regression. The degrees of freedom parameter of the t distribution provides a convenient dimension for achieving robust statistical inference, with moderate increases in computational complexity for many models. Estimation of precision from asymptotic theory and the bootstrap is discussed, and graphical methods for checking the appropriateness of the t distribution are presented.
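The EM iteration behind such fits, for the basic case of estimating a multivariate t mean and scatter matrix with fixed degrees of freedom, is short: the E-step downweights outlying points by their Mahalanobis distance and the M-step recomputes weighted moments. This is a generic sketch, not the article's full strategy (which also covers regression models, missing data, and estimation of the degrees of freedom nu).

```python
import numpy as np

def em_mvt(X, nu=4.0, n_iter=200):
    """EM for the mean and scatter matrix of a multivariate t
    distribution with fixed degrees of freedom nu.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    for _ in range(n_iter):
        # E-step: weights shrink with Mahalanobis distance,
        # so heavy-tailed outliers count for less.
        diff = X - mu
        d = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)
        u = (nu + p) / (nu + d)
        # M-step: weighted mean and scatter.
        mu = (u[:, None] * X).sum(axis=0) / u.sum()
        diff = X - mu
        S = (u[:, None] * diff).T @ diff / n
    return mu, S

rng = np.random.default_rng(4)
X = rng.standard_t(4, size=(500, 2)) + np.array([1.0, -2.0])
mu, S = em_mvt(X)
print(mu)  # robust estimate near (1, -2)
```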