Multivariate outcomes are often measured longitudinally. For example, in hearing loss studies, hearing thresholds for each subject are measured repeatedly over time at several frequencies. Thus, each patient is associated with a multivariate longitudinal outcome. The multivariate mixed-effects model is a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, it is known that hearing thresholds, at every frequency, increase with age. Moreover, this age-related threshold elevation is monotone in frequency, that is, the higher the frequency, the higher, on average, is the rate of threshold elevation. This means that there is a natural ordering among the different frequencies in the rate of hearing loss. In practice, this amounts to imposing a set of constraints on the frequency-specific regression coefficients that model the mean effect of time and age at entry to the study on hearing thresholds. These constraints should be accounted for in the analysis. The result is a multivariate longitudinal model with restricted parameters. We propose estimation and testing procedures for such models. We show that ignoring the constraints may lead to misleading inferences regarding the direction and the magnitude of various effects. Moreover, simulations show that incorporating the constraints substantially improves the mean squared error of the estimates and the power of the tests. We used this methodology to analyze a real hearing loss study. Copyright (C) 2012 John Wiley & Sons, Ltd.
In this paper, we analyze a mixture of Lognormal and Log-Logistic distributions. We estimate the parameters of the introduced distribution by using the expectation-maximization (EM) algorithm. Various phenomena in the fields of medicine and economics could be modeled by this mixture. In this paper, it is used to construct a new mortality model for determining the unisex premium rates in life insurance. The application of the model is illustrated for the Serbian population, and its advantages are presented in the context of life insurance premium calculation.
In applications such as carrier attitude control and mobile device navigation, a micro-electro-mechanical-system (MEMS) gyroscope will inevitably be affected by random vibration, which significantly degrades its performance. To address this degradation in random vibration environments, this paper proposes a combined method of a long short-term memory (LSTM) network and Kalman filter (KF) for error compensation, where the Kalman filter parameters are iteratively optimized using the Kalman smoother and expectation-maximization (EM) algorithm. To verify the effectiveness of the proposed method, we performed a linear random vibration test to acquire MEMS gyroscope data. Subsequently, an analysis of the effects of input data step size and network topology on gyroscope error compensation performance is presented. Furthermore, the autoregressive moving average-Kalman filter (ARMA-KF) model, which is commonly used in gyroscope error compensation, was also combined with the LSTM network as a comparison method. The results show that, for the x-axis data, the proposed combined method reduces the standard deviation (STD) by 51.58% and 31.92% compared to the bidirectional LSTM (BiLSTM) network and the EM-KF method, respectively. For the z-axis data, the proposed combined method reduces the standard deviation by 29.19% and 12.75% compared to the BiLSTM network and the EM-KF method, respectively. Furthermore, compared to the BiLSTM-ARMA-KF method, the proposed combined method reduces the standard deviation by 46.54% for the x-axis data and 22.30% for the z-axis data, and the output is smoother, demonstrating the effectiveness of the proposed method.
Haplotype analyses are an important area in the study of the genetic components of human disease. Associations between markers and disease loci that are not evident with a single marker locus may be identified in multi-locus marker analyses using estimated haplotype frequencies (HFs). Procedures that make use of the expectation-maximization (EM) algorithm to estimate HFs from unphased genotype data are in common use in genetic studies. The EM algorithm uses these unphased genotype frequencies along with the assumption of Hardy-Weinberg proportions (HWP) to converge on HF estimates. In this paper, we assess the accuracy of EM estimates of HFs in patients with type I diabetes for whom the true haplotypes are known, but the data are analyzed ignoring family information to allow comparison between estimated and true frequencies. The data consist of six HLA loci with high levels of polymorphism and a range of departures from HWP and linkage equilibrium. While the overall accuracy of the EM estimates is good, there can be large over- and underestimates of particular HFs, even for common haplotypes, especially when the loci involved deviate significantly from HWP. Estimating HFs for three or more loci and then collapsing over loci so as to generate two-locus haplotypes can improve the accuracy of the estimation. The collapsing procedure is most beneficial when one of the loci in the two-locus haplotype of interest deviates significantly from HWP and the locus collapsed over is in linkage disequilibrium with the other loci. Genet. Epidemiol. 22:186-195, 2002. (C) 2002 Wiley-Liss, Inc.
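As an illustration of the EM approach to haplotype frequency estimation described in the abstract above, the following minimal sketch handles only the simplest case: two biallelic loci, where the double heterozygote is the only phase-ambiguous genotype. It is not the procedure used in the study (which involved six highly polymorphic HLA loci); the function name `em_haplotype_freqs`, the 0/1/2 genotype coding, and the toy data are assumptions made for this example.

```python
from collections import Counter


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotypes: list of (a, b) pairs, where a and b are the number of
    copies (0, 1 or 2) of allele "1" at locus A and locus B.
    Returns a dict of frequencies for haplotypes "00", "01", "10", "11"
    (first character = allele at locus A, second = allele at locus B).
    Assumes Hardy-Weinberg proportions (random pairing of haplotypes).
    """
    counts = Counter(genotypes)
    n_hap = 2 * len(genotypes)

    # Haplotype counts from genotypes whose phase is unambiguous,
    # i.e. at least one of the two loci is homozygous.
    base = {"00": 0.0, "01": 0.0, "10": 0.0, "11": 0.0}
    for (a, b), n in counts.items():
        if (a, b) == (1, 1):
            continue  # double heterozygote: handled in the E-step
        alleles_a = [1] * a + [0] * (2 - a)
        alleles_b = [1] * b + [0] * (2 - b)
        for x, y in zip(alleles_a, alleles_b):
            base[f"{x}{y}"] += n

    n_dh = counts.get((1, 1), 0)  # number of double heterozygotes
    p = {h: 0.25 for h in base}   # uniform starting values

    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11
        # rather than 01/10, given the current frequencies.
        cis = p["00"] * p["11"]
        trans = p["01"] * p["10"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: expected haplotype counts divided by total haplotypes.
        new = {
            "00": (base["00"] + n_dh * w) / n_hap,
            "11": (base["11"] + n_dh * w) / n_hap,
            "01": (base["01"] + n_dh * (1 - w)) / n_hap,
            "10": (base["10"] + n_dh * (1 - w)) / n_hap,
        }
        delta = max(abs(new[h] - p[h]) for h in p)
        p = new
        if delta < tol:
            break
    return p


# Toy data (hypothetical): 5 individuals 00/00, 5 individuals 11/11,
# and 10 double heterozygotes of unknown phase.
toy = [(0, 0)] * 5 + [(2, 2)] * 5 + [(1, 1)] * 10
estimates = em_haplotype_freqs(toy)
print(estimates)
```

With more loci or more alleles per locus, the number of haplotype pairs compatible with each genotype grows quickly, and the E-step must sum over all of them; this sketch sidesteps that by restricting attention to the single ambiguous genotype of the two-locus biallelic case.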
Generalized linear models are used to describe the dependence of data on explanatory variables when the binary outcome is subject to misclassification. Both probit and t-link regressions for misclassified binary data under Bayesian methodology are proposed. The computational difficulties are avoided by using data augmentation. The idea of using a data augmentation framework (with two types of latent variables) is exploited to derive efficient Gibbs sampling and expectation-maximization algorithms. In addition, this formulation allows the probit model to be obtained as a particular case of the t-link model. Simulation examples are presented to illustrate the model performance compared with standard methods that do not consider misclassification. To show the potential of the proposed approaches, a real data problem arising in the study of hearing loss caused by exposure to occupational noise is analysed.
Robust design techniques, which are based on the concept of building quality into products or processes, are increasingly popular in many manufacturing industries. In this paper, we propose a new robust design model in the context of pharmaceutical production research and development. Traditional robust design principles have often been applied to situations in which the quality characteristics of interest are typically time insensitive. In pharmaceutical manufacturing processes, time-oriented quality characteristics, such as the degradation of a drug, are often of interest. As a result, current robust design models for quality improvement which have been studied in the literature may not be effective in finding robust design solutions. To address such practical needs, this paper develops a robust design model using censored data, which is perhaps the first attempt in the robust design field. We then study estimation methods, such as the expectation-maximization algorithm and the maximum likelihood method, in the robust design context. Finally, comparative studies are discussed for model verification via a numerical example.
Principal component analysis (PCA) is a widely used statistical technique for determining subscales in questionnaire data. As with any other statistical technique, missing data may complicate both its execution and its interpretation. In this study, six methods for dealing with missing data in the context of PCA are reviewed and compared: listwise deletion (LD), pairwise deletion, the missing data passive approach, regularized PCA, the expectation-maximization algorithm, and multiple imputation. Simulations show that, except for LD, all methods give about equally good results for realistic percentages of missing data. Therefore, the choice of a procedure can be based on ease of application or simply on the availability of a technique.
In this article, we study a generalization of the two-groups model in the presence of covariates, a problem that has recently received much attention in the statistical literature due to its applicability in multiple hypotheses testing problems. The model we consider allows for infinite dimensional parameters and offers flexibility in modeling the dependence of the response on the covariates. We discuss the identifiability issues arising in this model and systematically study several estimation strategies. We propose a tuning parameter-free nonparametric maximum likelihood method, implementable via the expectation-maximization algorithm, to estimate the unknown parameters. Further, we derive the rate of convergence of the proposed estimators; in particular, we show that the finite sample Hellinger risk for every 'approximate' nonparametric maximum likelihood estimator achieves a near-parametric rate (up to logarithmic multiplicative factors). In addition, we propose and theoretically study two 'marginal' methods that are more scalable and easily implementable. We demonstrate the efficacy of our procedures through extensive simulation studies and relevant data analyses, one arising from neuroscience and the other from astronomy. We also outline the application of our methods to multiple testing. The companion R package NPMLEmix implements all the procedures proposed in this article.
Author:
Ohashi, Jun
Univ Tsukuba, Doctoral Program Life Syst Med Sci, Grad Sch Comprehens Human Sci, Tsukuba, Ibaraki 3058575, Japan
The association between a copy number variant (CNV) and susceptibility to disease has drawn much attention. In this study, a case-control association test for a CNV locus with multiple alleles is proposed for detecting a single CNV allele associated with a disease. In the association test, CNV allele frequencies are estimated for cases and controls separately using an expectation-maximization (EM) algorithm, and a chi-square value is calculated for each CNV allele to compare its estimated frequency between cases and controls. A permutation procedure is used to obtain an empirical P-value for each CNV allele and to control the global type I error rate. The statistical power of the present association test was evaluated by a computer simulation analysis with several parameter settings. The results revealed that the statistical power was markedly different among CNV alleles with different copy numbers, and a higher power could be achieved for a susceptible allele with the lowest or highest copy number in comparison with those with intermediate copy numbers. Journal of Human Genetics (2009) 54, 169-173; doi: 10.1038/jhg.2009.8; published online 6 February 2009
Although change-point analysis methods for longitudinal data have been developed, it is often of interest to detect multiple change points in longitudinal data. In this paper, we propose a linear mixed effects modeling framework for identifying multiple change points in longitudinal Gaussian data. Specifically, we develop a novel statistical and computational framework that integrates the expectation-maximization and the dynamic programming algorithms. We conduct a comprehensive simulation study to demonstrate the performance of our method. We illustrate our method with an analysis of data from a trial evaluating a behavioral intervention for the control of type I diabetes in adolescents with HbA1c as the longitudinal response variable. Copyright (c) 2013 John Wiley & Sons, Ltd.