This paper investigates the estimation and testing of treatment effects in randomized controlled trials where an imperfect diagnostic device is used to assign subjects to treatment and control group(s). The paper focuses on the pre-post design and proposes two new methods for estimating and testing treatment effects. Furthermore, methods are devised for computing sample sizes for such designs that account for misclassification of the subjects. The methods are compared with each other and with a traditional method that ignores the imperfection of the diagnostic device. In particular, the likelihood-based approach shows a significant advantage in terms of power and coverage probability and, consequently, a reduction in the required sample size. The application of the results is illustrated with data from an aging trial for dementia and data from electroencephalogram (EEG) recordings of alcoholic and non-alcoholic subjects. Copyright (C) 2016 John Wiley & Sons, Ltd.
Multivariate mixtures of Erlang distributions form a versatile, yet analytically tractable, class of distributions, making them suitable for multivariate density estimation. We present a flexible and effective fitting procedure for multivariate mixtures of Erlangs, which iteratively uses the EM algorithm, by introducing a computationally efficient initialization and adjustment strategy for the shape parameter vectors. We furthermore extend the EM algorithm for multivariate mixtures of Erlangs to be able to deal with randomly censored and fixed truncated data. The effectiveness of the proposed algorithm is demonstrated on simulated as well as real data sets.
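As a rough illustration of the EM idea mentioned in this abstract, the sketch below fits a much simpler model than the one in the paper: a univariate mixture of Erlangs with a common scale, fixed integer shapes, and fully observed (uncensored, untruncated) data. The function names, the starting values and the simulated data are assumptions made for the example. The shape initialization and adjustment strategy of the paper is not reproduced.

```python
import math
import random

def erlang_logpdf(x, shape, scale):
    """Log-density of an Erlang(shape, scale) distribution at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - shape * math.log(scale) - math.lgamma(shape))

def em_erlang_mixture(data, shapes, weights, scale, n_iter=200):
    """EM for a univariate Erlang mixture with fixed shapes and common scale.

    Simplifying assumptions: shapes stay fixed, data are fully observed.
    Returns the updated mixing weights and the common scale.
    """
    n = len(data)
    mean_x = sum(data) / n
    for _ in range(n_iter):
        # E-step: accumulate posterior component probabilities per component.
        totals = [0.0] * len(shapes)
        for x in data:
            logs = [
                (math.log(w) if w > 0 else -math.inf) + erlang_logpdf(x, r, scale)
                for w, r in zip(weights, shapes)
            ]
            m = max(logs)
            probs = [math.exp(v - m) for v in logs]  # stabilised exponentials
            s = sum(probs)
            for j, pr in enumerate(probs):
                totals[j] += pr / s
        # M-step: closed-form updates for the weights and the common scale.
        weights = [t / n for t in totals]
        scale = mean_x / sum(w * r for w, r in zip(weights, shapes))
    return weights, scale

# Hypothetical simulated data: two Erlang components sharing scale 1.5.
random.seed(0)
data = ([random.gammavariate(2, 1.5) for _ in range(300)]
        + [random.gammavariate(6, 1.5) for _ in range(300)])
weights, scale = em_erlang_mixture(data, shapes=[2, 6], weights=[0.5, 0.5], scale=1.0)
```

After each M-step the fitted mixture mean, `scale * sum(w * r)`, equals the sample mean, which gives a convenient sanity check on an implementation.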
Joint latent class modeling of disease prevalence and high-dimensional semicontinuous biomarker data has been proposed to study the relationship between diseases and their related biomarkers. However, statistical inference for the joint latent class modeling approach has proved very challenging because of the computational complexity of seeking maximum likelihood estimates. In this article, we propose a series of composite likelihoods for maximum composite likelihood estimation, as well as an enhanced Monte Carlo expectation-maximization (MCEM) algorithm for maximum likelihood estimation, in the context of joint latent class models. Theoretically, the maximum composite likelihood estimates are consistent and asymptotically normal. Numerically, we show that, compared with the MCEM algorithm that maximizes the full likelihood, the composite likelihood approach coupled with the quasi-Newton method not only substantially reduces the computational complexity and duration but also retains comparable estimation efficiency.
Group testing, introduced by Dorfman (1943), has been used to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of k groups that include n independent individuals in total. If the unknown prevalence is low and the screening test suffers from misclassification, it is also possible to obtain more precise prevalence estimates than those obtained from testing all n samples separately (Tu et al., 1994). In some applications, the individual binary response corresponds to whether an underlying time-to-event variable T is less than an observed screening time C, a data structure known as current status data. Given sufficient variation in the observed C values, it is possible to estimate the distribution function F of T nonparametrically, at least at some points in its support, using the pool-adjacent-violators algorithm (Ayer et al., 1955). Here, we consider nonparametric estimation of F based on group-tested current status data for groups of size k where the group tests positive if and only if any individual's unobserved T is less than the corresponding observed C. We investigate the performance of the group-based estimator as compared to the individual test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification for low values of F(t). Potential applications include testing for the presence of various diseases in pooled samples where interest focuses on the age-at-incidence distribution rather than overall prevalence. We apply this estimator to the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done at random and based on maternal age. We discuss connections to other work in the literature, as well as potential extensions.
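To make the role of misclassification in group-based prevalence estimation concrete, here is a minimal sketch. It assumes equal group sizes and a screening test whose sensitivity and specificity are known and do not depend on group size. It covers only the plain prevalence problem from the opening of this abstract, not the nonparametric estimator of F studied in the paper. Function and parameter names are chosen for illustration.

```python
def group_positive_prob(p, k, sensitivity=1.0, specificity=1.0):
    """Probability that a group of size k tests positive at prevalence p."""
    q = (1.0 - p) ** k  # probability that the group is truly negative
    return sensitivity * (1.0 - q) + (1.0 - specificity) * q

def estimate_prevalence(n_positive, n_groups, k, sensitivity=1.0, specificity=1.0):
    """Plug-in prevalence estimate from the observed share of positive groups.

    Inverts group_positive_prob; assumes sensitivity + specificity > 1.
    """
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    pi_hat = n_positive / n_groups
    q_hat = (sensitivity - pi_hat) / youden
    q_hat = min(1.0, max(0.0, q_hat))  # keep the estimate inside [0, 1]
    return 1.0 - q_hat ** (1.0 / k)

# Round trip with hypothetical values: recover p from the exact positive rate.
pi = group_positive_prob(0.02, k=10, sensitivity=0.95, specificity=0.98)
p_hat = estimate_prevalence(pi * 1000, 1000, k=10, sensitivity=0.95, specificity=0.98)
```

Ignoring the error rates (leaving both at 1.0) would attribute false positive groups to true infection, which matters most when the prevalence is low.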
Because the Student's t-distribution has heavier tails than the Gaussian distribution, a spatially variant finite mixture model with Student's t-distribution component functions is proposed, under a Bayesian framework, for grayscale image segmentation. First, to avoid an additional computational step and improve the efficiency of the proposed model, a representation of the contextual mixing proportion is adopted. Second, the spatial information of the pixels is closely related to the Gaussian distribution of their neighborhood system. Third, the inherent relationship between the Gaussian distribution and the Student's t-distribution is used to optimize the unknown parameters of the proposed model, which simplifies the inference process and makes the proposed model easy to implement. Comprehensive experiments on synthetic noise images, simulated medical images and real-world grayscale images are presented to illustrate the superior performance of the proposed model in terms of visual and quantitative comparison. (C) 2016 Elsevier Inc. All rights reserved.
We study nonparametric maximum likelihood estimation for the distribution of spherical radii using samples containing a mixture of one-dimensional, two-dimensional biased and three-dimensional unbiased observations. Since direct maximization of the likelihood function is intractable, we propose an expectation-maximization algorithm for implementing the estimator, which handles an indirect measurement problem and a sampling bias problem separately in the E- and M-steps, and circumvents the need to solve an Abel-type integral equation, which creates numerical instability in the one-sample problem. Extensions to ellipsoids are studied and connections to multiplicative censoring are discussed.
The marker-stratified design (MSD) is an important design to assess treatment and marker effects in personalized medicine. The MSD stratifies patients into marker positive and marker negative subgroups on the basis of their biomarker profiles and then randomizes them to the standard treatment or a new treatment within each subgroup. The performance of the MSD can be seriously undermined when the biomarker is measured with error (or misclassified). A recently proposed analytic method corrects the biomarker misclassification in the MSD under the assumptions that the biomarker classification rates are known and no other covariates need to be adjusted. We propose a two-stage MSD to relax these assumptions. We analytically investigate the bias in the estimation of prognostic and predictive marker effects and treatment effects caused by biomarker misclassification in the presence of covariates, and we propose an expectation-maximization algorithm to correct such biases. The design does not require prespecification of the misclassification rates and can incorporate any covariates that potentially confound the prognostic and predictive marker effects and treatment effect. Numerical trial applications show that the method has desirable operating characteristics.
We propose taking advantage of methodology for missing data to estimate relationships and adjust outcomes in a meta-analysis where a continuous covariate is differentially categorized across studies. The proposed method incorporates all available data in an implementation of the expectation-maximization algorithm. We use simulations to demonstrate that the proposed method eliminates bias that would arise by ignoring a covariate and generalizes the meta-analytical approach for incorporating covariates that are not uniformly categorized. The proposed method is illustrated in an application for estimating diarrhea incidence in children aged ≤59 months.
Methods of estimating allele frequencies from data on unrelated and related individuals are described in this chapter. For samples of unrelated individuals with simple codominant markers, the natural estimators of allele frequencies can be used. For genetic data on related individuals, maximum likelihood estimation (MLE) can be applied to compute allele frequencies. Factors that influence allele frequencies in populations are also explained.
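For unrelated individuals typed at a codominant marker, the "natural" estimator is usually taken to be gene counting: every diploid genotype contributes two observed alleles, and each allele frequency is its share of the total. The sketch below assumes that reading. The function name and the toy sample are illustrative, and the related-individuals case, which needs a likelihood that accounts for pedigree structure, is not covered.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Estimate allele frequencies by gene counting.

    genotypes: iterable of 2-tuples of allele labels, one tuple per
    unrelated diploid individual typed at a codominant marker.
    """
    counts = Counter()
    for first, second in genotypes:
        counts[first] += 1
        counts[second] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical sample of four individuals: AA, AB, BB, AB.
sample = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
freqs = allele_frequencies(sample)  # {'A': 0.5, 'B': 0.5}
```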
We extend to the longitudinal setting a latent class approach that was recently introduced by Lanza, Coffman, and Xu to estimate the causal effect of a treatment. The proposed approach enables an evaluation of multiple treatment effects on subpopulations of individuals from a dynamic perspective, as it relies on a latent Markov (LM) model that is estimated taking into account propensity score weights based on individual pretreatment covariates. These weights enter the expression of the likelihood function of the LM model and allow us to balance the groups receiving different treatments. This likelihood function is maximized through a modified version of the traditional expectation-maximization algorithm, while standard errors for the parameter estimates are obtained by a nonparametric bootstrap method. We study in detail the asymptotic properties of the causal effect estimator based on the maximization of this likelihood function, and we illustrate its finite sample properties through a series of simulations showing that the estimator has the expected behavior. As an illustration, we consider an application aimed at assessing the relative effectiveness of certain degree programs on the basis of three ordinal response variables, in which the work path of a graduate is considered as the manifestation of his or her human capital level across time.