Correlated binary response data with covariates are ubiquitous in longitudinal and spatial studies. Among the existing statistical models, the best known for this type of data is the multivariate probit model, which uses a Gaussian link to model dependence at the latent level. However, a symmetric link may not be appropriate if the data are highly imbalanced. Here, we propose a multivariate skew-elliptical link model for correlated binary responses, which includes the multivariate probit model as a special case. Furthermore, we perform Bayesian inference for this new model and prove that, under an elliptical prior, the regression coefficients have a closed-form unified skew-elliptical posterior. The new methodology is illustrated by an application to COVID-19 data from three different counties of the state of California, USA. By jointly modeling extreme spikes in weekly new cases, our results show that the spatial dependence cannot be neglected. The results also show that the skewed latent structure of our proposed model improves on the flexibility of the multivariate probit model and provides a better fit to our highly imbalanced dataset.
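To make the latent-level dependence of the multivariate probit model concrete, here is a minimal simulation sketch for the bivariate special case. It is not the skew-elliptical model proposed in the abstract; the function names, the uniform covariates, and the single shared coefficient `beta` are illustrative assumptions.

```python
import math
import random


def simulate_bivariate_probit(n, beta, rho, seed=0):
    """Draw n pairs of binary responses from a bivariate probit model.

    Response j equals 1 when x_j * beta + z_j > 0, where (z_1, z_2) is a
    standard bivariate normal vector with correlation rho.
    The covariates are hypothetical: independent Uniform(-1, 1) draws.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Build correlated latent errors from two independent normals.
        z1 = e1
        z2 = rho * e1 + math.sqrt(1 - rho ** 2) * e2
        y = (int(x[0] * beta + z1 > 0), int(x[1] * beta + z2 > 0))
        data.append((x, y))
    return data


def binary_correlation(data):
    """Pearson (phi) correlation between the two binary responses."""
    n = len(data)
    m1 = sum(y[0] for _, y in data) / n
    m2 = sum(y[1] for _, y in data) / n
    m12 = sum(y[0] * y[1] for _, y in data) / n
    return (m12 - m1 * m2) / math.sqrt(m1 * (1 - m1) * m2 * (1 - m2))
```

Comparing `binary_correlation` for `rho=0.8` and `rho=0.0` shows how correlation in the latent Gaussian errors carries over, in attenuated form, to the observed binary responses.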
Correlated binary data arise in a large variety of biomedical research. In order to evaluate methods for their analysis, computer simulations of such data are often required. Existing methods often cannot cover the full range of possible correlations between the variables or are not available as implemented software. We propose a genetic algorithm that approaches the desired correlation structure under a given marginal distribution. The procedure generates a large representative matrix from which the probabilities of individual observations can be derived or from which samples can be drawn directly. Our genetic algorithm is evaluated under different specified marginal frequencies and correlation structures, and is compared against two existing approaches. The evaluation checks the speed and precision of the approach as well as its suitability for generating high-dimensional data. In an example of high-throughput glycan array data, we demonstrate the usability of our approach to simulate the power of global test procedures. Implementations of our own and two other methods were added to the R package 'RepeatedHighDim'. The presented algorithm is not restricted to certain correlation structures. In contrast to existing methods, it is also evaluated for high-dimensional data.
Moment methods for analyzing repeated binary responses have been proposed by Liang & Zeger (1986), and extended by Prentice (1988). In their generalized estimating equations, both Liang & Zeger (1986) and Prentice (1988) estimate the parameters associated with the expected value of an individual's vector of binary responses as well as the correlations between pairs of binary responses. Because the odds ratio has many desirable properties, and some investigators may find the odds ratio easier to interpret, we discuss modelling the association between binary responses at pairs of times with the odds ratio. We then modify the estimating equations of Prentice to estimate the odds ratios. In simulations, the parameter estimates for the logistic regression model for the marginal probabilities appear slightly more efficient when using the odds ratio parameterization.
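As a small illustration of the association measure discussed above, the following sketch computes the empirical odds ratio between binary responses at a pair of times from the pooled 2x2 table. It only shows the quantity being modelled, not the modified estimating equations; the function name and the example counts are hypothetical.

```python
def pairwise_odds_ratio(pairs):
    """Empirical odds ratio between two binary responses.

    `pairs` is an iterable of (y_s, y_t) tuples, one per subject, giving
    the binary responses at two times s and t.
    """
    n11 = n10 = n01 = n00 = 0
    for y_s, y_t in pairs:
        if y_s == 1 and y_t == 1:
            n11 += 1
        elif y_s == 1 and y_t == 0:
            n10 += 1
        elif y_s == 0 and y_t == 1:
            n01 += 1
        else:
            n00 += 1
    if n10 == 0 or n01 == 0:
        raise ValueError("odds ratio undefined: a discordant cell is empty")
    # Cross-product ratio of the 2x2 table.
    return (n11 * n00) / (n10 * n01)


# Hypothetical counts: 40 (1,1), 10 (1,0), 10 (0,1), 40 (0,0).
example_pairs = [(1, 1)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 40
print(pairwise_odds_ratio(example_pairs))  # prints 16.0
```

An odds ratio of 1 corresponds to no association between the two responses, and values above 1 indicate positive association.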
The paper considers comparative studies in which subjects use more than one product or receive more than one treatment. The paper is focused mainly on the comparison of products, including the possibility of a large number of products. The data to be analysed are on a binary variable that is observed by each subject for each product used. The calculation of the number of subjects or sample size is illustrated for a variety of basic study designs, assuming that the data may be correlated and the analysis of the data uses generalized estimating equations methodology. The sample sizes use asymptotic theory developed by Liu and Liang. A simulation study that evaluates some of the resulting sample sizes suggests that these need to be increased slightly to achieve the nominal power.
For analyzing correlated binary data with high-dimensional covariates, we propose in this paper a two-stage shrinkage ***, we construct a weighted least-squares (WLS) type function using a special weighting scheme on the non-conservative vector field of the generalized estimating equations (GEE) ***, we define a penalized WLS in the spirit of the adaptive LASSO for simultaneous variable selection and parameter *** proposed procedure enjoys the oracle properties in a high-dimensional framework where the number of parameters grows to infinity with the number of ***, we prove the consistency of the sandwich formula of the covariance matrix even when the working correlation matrix is *** the selection of the tuning parameter, we develop a consistent penalized quadratic form (PQF) function *** performance of the proposed method is assessed through a comparison with the existing methods and through an application to a crossover trial in a pain relief study.
Data augmentation has been commonly utilized to analyze correlated binary data using multivariate probit models in Bayesian analysis. However, the identification issue in multivariate probit models necessitates a rigorous Metropolis-Hastings algorithm for sampling a correlation matrix, which may cause slow convergence and inefficiency of the Markov chains. It is well known that parameter-expanded data augmentation, by introducing a working/artificial parameter or parameter vector, renders an identifiable model non-identifiable and improves the mixing and convergence of data augmentation components. We are therefore motivated to develop efficient parameter-expanded data augmentations to analyze correlated binary data using multivariate probit models. We investigate both the identifiable and non-identifiable multivariate probit models and develop the corresponding parameter-expanded data augmentation algorithms. We point out that the approaches based on one non-identifiable model circumvent a Metropolis-Hastings algorithm for sampling a correlation matrix and improve the convergence and mixing of correlation parameters; the identifiable model may produce estimated regression parameters with smaller standard errors than the non-identifiable model does. We illustrate our proposed approaches using simulation studies and through an application to a longitudinal dataset from the Six Cities study.
Confidence interval (CI) methods for the ratio of two proportions in the presence of correlated bilateral binary data are constructed for comparative clinical trials with a stratified design. Simulations are conducted to evaluate the performance of the presented CIs with respect to mean coverage probability (MCP), mean interval width (MIW), and the ratio of mesial non-coverage probability to distal non-coverage probability (RMNCP). Based on the empirical results, we suggest the use of the proposed CI method based on the complete score statistics (CS) for general applications. An example from a rheumatology study is used to demonstrate the proposed methodologies.
In this paper, three analysis procedures for repeated correlated binary data with no a priori ordering of the measurements are described and subsequently investigated. Examples of correlated binary data could be the binary assessments of subjects obtained by several raters in the framework of a clinical trial. This topic is especially relevant when success criteria have to be defined for dedicated imaging trials involving several raters conducted for regulatory purposes. First, an analytical result on the expectation of the 'Majority rater' is presented when only the marginal distributions of the single raters are given. The paper provides a simulation study where all three analysis procedures are compared for a particular setting. It turns out that in many cases, 'Average rater' is associated with a gain in power. Settings were identified where 'Majority significant' has favorable properties. 'Majority rater' is in many cases difficult to interpret. Copyright (c) 2014 John Wiley & Sons, Ltd.
A critical issue in modeling binary response data is the choice of the link. We introduce a new link based on the Student's t-distribution (t-link) for correlated binary data. The t-link extends the common probit-normal link by adding one parameter that controls the heaviness of the tails of the link. We propose an EM algorithm for computing the maximum likelihood estimates for generalized linear mixed t-link models for correlated binary data. In contrast with recent developments (Tan et al. in J. Stat. Comput. Simul. 77:929-943, 2007; Meza et al. in Comput. Stat. Data Anal. 53:1350-1360, 2009), this algorithm uses closed-form expressions at the E-step, as opposed to Monte Carlo simulation. Our proposed algorithm relies on available formulas for the mean and variance of a truncated multivariate t-distribution. To illustrate the new method, a real data set on respiratory infection in children and a simulation study are presented.
Binary responses are correlated when the sampling units are clustered or when repeated binary responses are taken on the same experimental unit. In this paper we present a Bayesian analysis of logistic regression models for correlated binary data with random effects. We assume that the random effects, namely alpha(i), i = 1,..., n, are drawn from a mixture of normal distributions. This assumption gives great flexibility in fitting correlated binary data. Using Gibbs sampling with Metropolis-Hastings algorithms, we obtain Monte Carlo estimates for the posterior quantities of interest.
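The data-generating side of such a model can be sketched as follows: each cluster i receives a random intercept alpha(i) drawn from a mixture of normals, and its binary responses are then drawn from a logistic model. This is only a simulation sketch, not the Gibbs/Metropolis-Hastings sampler of the abstract; the two-component mixture with means -1 and 1, the common standard deviation 0.5, and the intercept-only predictor are assumptions chosen for illustration.

```python
import math
import random


def simulate_clustered_logistic(n_clusters, cluster_size, beta0, seed=0):
    """Simulate clustered binary data with mixture-of-normals random effects.

    Assumed mixture (illustrative only):
        alpha_i ~ 0.5 * N(-1, 0.5^2) + 0.5 * N(1, 0.5^2)
    Responses within cluster i are Bernoulli with
        P(y = 1) = 1 / (1 + exp(-(beta0 + alpha_i))).
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        # Pick a mixture component, then draw the random intercept from it.
        component_mean = -1.0 if rng.random() < 0.5 else 1.0
        alpha = rng.gauss(component_mean, 0.5)
        p = 1.0 / (1.0 + math.exp(-(beta0 + alpha)))
        # Responses share alpha, which induces within-cluster correlation.
        clusters.append([int(rng.random() < p) for _ in range(cluster_size)])
    return clusters
```

Because all responses in a cluster share the same alpha(i), they are marginally correlated even though they are conditionally independent given the random effect.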