In many real-life situations, the strength of a system and the stress applied to it change over time. In this paper, we consider time-dependent stress-strength reliability models subjected to random stresses at random cycles of time. Each run of the system causes a change in the strength of the system over time. We obtain the stress-strength reliability of the system at time t when the initial stress and initial strength of the system follow continuous phase-type distributions and the time taken for completing a run, called the cycle time, is a random variable assumed to have an exponential, gamma or Weibull distribution. Using simulated data sets, we study the variation in stress-strength reliability at different time points corresponding to different sets of model parameters.
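As a hedged illustration of the time-dependent idea, the sketch below estimates R(t) = P(strength at time t > stress) by Monte Carlo, with strength degrading by a random amount after each run completed before t. The gamma, exponential and uniform choices here are illustrative stand-ins for the paper's phase-type and cycle-time models, not its actual specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def reliability_at(t, n=20_000):
    """Monte Carlo estimate of R(t) under illustrative distributions."""
    strength = rng.gamma(5.0, 2.0, size=n)       # initial strength, mean 10
    clock = np.zeros(n)
    alive = np.ones(n, dtype=bool)               # still accumulating runs before t
    while alive.any():
        clock += rng.exponential(1.0, size=n)    # exponential cycle times
        alive &= ~(clock >= t)                   # runs finishing after t don't count
        # each run completed before t degrades strength by a random amount
        strength[alive] -= rng.uniform(0.0, 0.5, size=alive.sum())
    stress = rng.exponential(2.0, size=n)        # stress applied at time t
    return np.mean(strength > stress)

for t in (1.0, 5.0, 10.0):
    print(f"R({t}) = {reliability_at(t):.3f}")
```

Because strength only degrades as more runs complete, the estimated reliability decreases in t.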
Semicontinuous data, characterized by a sizable number of zeros and observations from a continuous distribution, are frequently encountered in health research concerning food consumption, physical activity, medical and pharmacy claims expenditures, and many others. In analyzing such semicontinuous data, it is imperative that the excessive zeros be adequately accounted for to obtain unbiased and efficient inference. Although many methods have been proposed in the literature for the modeling and analysis of semicontinuous data, little attention has been given to clustering of semicontinuous data to identify important patterns that could be indicative of certain health outcomes or intervention effects. We propose a Bernoulli-normal mixture model for clustering of multivariate semicontinuous data and demonstrate its accuracy compared to the well-known clustering method based on the conventional normal mixture model. The proposed method is illustrated with data from a dietary intervention trial to promote healthy eating behavior among children with type 1 diabetes. In the trial, certain diabetes-friendly foods (e.g., total fruit, whole fruit, dark green and orange vegetables and legumes, whole grain) were consumed by only a proportion of study participants, yielding excessive zero values due to nonconsumption of the foods. Baseline food consumption data in the trial are used to explore preintervention dietary patterns among study participants. While the conventional normal mixture model approach fails to do so, the proposed Bernoulli-normal mixture model approach is able to identify a dietary profile that significantly differentiates the intervention effects from the others, as measured by the popular healthy eating index at the end of the trial.
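A minimal bivariate sketch of the Bernoulli-normal mixture idea is given below: each cluster k and variable j carries a zero probability p[k, j] and a normal component for the nonzero part, fitted by EM. The independence of coordinates within a cluster, the simulated "consumption" data and all names are simplifying assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def em_bn_mixture(X, K=2, iters=200):
    """EM for a Bernoulli-normal mixture with independent coordinates."""
    n, d = X.shape
    Z = (X == 0)                               # zero indicators
    w = np.full(K, 1.0 / K)                    # cluster weights
    p = np.full((K, d), 0.5)                   # zero probabilities
    qs = np.linspace(25, 75, K)                # spread initial means via quantiles
    mu = np.array([[np.percentile(X[~Z[:, j], j], q) for j in range(d)]
                   for q in qs])
    sd = np.array([[X[~Z[:, j], j].std() for j in range(d)]] * K)
    for _ in range(iters):
        # E-step: mixed discrete/continuous density, product over coordinates
        comp = np.where(Z[:, None, :], p,
                        (1 - p) * norm.pdf(X[:, None, :], mu, sd))
        r = w * comp.prod(axis=2)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weights, zero probabilities, then normal parameters
        w = r.mean(axis=0)
        p = (r[:, :, None] * Z[:, None, :]).sum(0) / r.sum(0)[:, None]
        rnz = r[:, :, None] * (~Z)[:, None, :]
        mu = (rnz * X[:, None, :]).sum(0) / rnz.sum(0)
        sd = np.sqrt((rnz * (X[:, None, :] - mu) ** 2).sum(0) / rnz.sum(0))
    return w, p, mu

def simulate(n, pzero, means):
    nz = rng.random((n, 2)) >= pzero
    return nz * rng.normal(means, 0.5, (n, 2))

X = np.vstack([simulate(500, 0.1, [5.0, 4.0]),    # frequent consumers
               simulate(500, 0.7, [1.0, 1.0])])   # rare consumers, many zeros
w, p, mu = em_bn_mixture(X)
print("zero-probs:\n", np.round(p, 2), "\nmeans:\n", np.round(mu, 2))
```

A plain normal mixture applied to the same data would have to absorb the zero spikes into its Gaussian components, which is the failure mode the Bernoulli part avoids.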
In this paper, we consider the problem of making statistical inference for a truncated normal distribution under progressive type I interval censoring. We obtain maximum likelihood estimators of unknown parameters using the expectation-maximization algorithm and subsequently compute the corresponding midpoint estimates of the parameters. Estimation based on the probability plot method is also considered. Asymptotic confidence intervals of unknown parameters are constructed based on the observed Fisher information matrix. We obtain Bayes estimators of parameters with respect to informative and non-informative prior distributions under squared error and LINEX loss functions. We compute these estimates using the importance sampling procedure. The highest posterior density intervals of unknown parameters are constructed as well. We present a Monte Carlo simulation study to compare the performance of the proposed point and interval estimators. Analysis of a real data set is also performed for illustration purposes. Finally, inspection times and optimal censoring plans based on the expected Fisher information matrix are discussed.
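For a flavor of the parametric model involved, the sketch below fits a truncated normal by direct maximum likelihood on complete (uncensored) data using `scipy.stats.truncnorm`; the harder interval-censored EM machinery of the paper is not reproduced, and the truncation bounds and true parameters are invented for the example.

```python
import numpy as np
from scipy.stats import truncnorm
from scipy.optimize import minimize

rng = np.random.default_rng(7)

# Simulate from a normal(4, 2) truncated to [0, 10]
a, b = 0.0, 10.0
mu_true, sd_true = 4.0, 2.0
alpha, beta = (a - mu_true) / sd_true, (b - mu_true) / sd_true
x = truncnorm.rvs(alpha, beta, loc=mu_true, scale=sd_true,
                  size=2000, random_state=rng)

def negloglik(theta):
    mu, log_sd = theta
    sd = np.exp(log_sd)                          # keep sigma positive
    al, be = (a - mu) / sd, (b - mu) / sd        # standardized bounds
    return -truncnorm.logpdf(x, al, be, loc=mu, scale=sd).sum()

res = minimize(negloglik, x0=[x.mean(), np.log(x.std())], method="Nelder-Mead")
mu_hat, sd_hat = res.x[0], np.exp(res.x[1])
print(f"mu_hat={mu_hat:.2f}, sd_hat={sd_hat:.2f}")
```

Note that the sample mean and standard deviation are biased starting values here (truncation shifts both), which is exactly why the likelihood has to renormalize by the standardized bounds.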
Statistical matching is a technique to combine variables in two or more nonoverlapping samples that are drawn from the same population. In the current study, the unobserved joint distribution between two target variables in nonoverlapping samples is estimated using a parametric model. A classical assumption for estimating this joint distribution is that the target variables are independent given the background variables observed in both samples. A problem with the use of this conditional independence assumption is that the estimated joint distribution may be severely biased when the assumption does not hold, which in general will be unacceptable for official statistics. Here, we explored to what extent the accuracy can be improved by the use of two types of auxiliary information: a common administrative variable and a small additional sample from a similar population. This additional sample is included by using the partial correlation of the target variables given the background variables or by using an EM algorithm. In total, four different approaches were compared to estimate the joint distribution of the target variables. Starting with empirical data, we show how the accuracy of the joint distribution is affected by the use of administrative data and by the size of the additional sample included via a partial correlation and through an EM algorithm. The study further shows how this accuracy depends on the strength of the relations among the target and auxiliary variables. We found that including a common administrative variable does not always improve the accuracy of the results. We further found that the EM algorithm nearly always yielded the most accurate results; this effect is largest when the explained variance of the separate target variables by the common background variables is not large.
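The bias caused by the conditional independence (CI) assumption can be demonstrated in a few lines. In a toy trivariate normal setting with standardized variables, where X is observed in both samples, Y only in sample A and Z only in sample B, CI given X implies rho_YZ = rho_YX * rho_XZ; whenever the true partial correlation of Y and Z given X is nonzero, the matched estimate is off. All numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

cov = np.array([[1.0, 0.6, 0.5],     # X (common background variable)
                [0.6, 1.0, 0.7],     # Y (target, sample A only)
                [0.5, 0.7, 1.0]])    # Z (target, sample B only); true rho_YZ = 0.7
data = rng.multivariate_normal(np.zeros(3), cov, size=20_000)
xA, yA = data[:10_000, 0], data[:10_000, 1]   # sample A observes (X, Y)
xB, zB = data[10_000:, 0], data[10_000:, 2]   # sample B observes (X, Z)

rho_yx = np.corrcoef(xA, yA)[0, 1]
rho_xz = np.corrcoef(xB, zB)[0, 1]
rho_yz_ci = rho_yx * rho_xz                    # CI-based matched estimate
print(f"CI estimate of rho_YZ: {rho_yz_ci:.2f} (true value 0.70)")
```

The CI estimate lands near 0.6 * 0.5 = 0.30 rather than 0.70, which is the kind of severe bias that the auxiliary administrative variable and the small additional sample are meant to repair.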
Record linkage addresses the problem of identifying pairs of records that come from different sources and refer to the same unit of interest. Fellegi and Sunter propose an optimal statistical test for assigning the match status to candidate pairs, in which the needed parameters are obtained through an EM algorithm applied directly to the set of candidate pairs, without recourse to training data. However, this procedure has quadratic complexity as the two lists to be matched grow. In addition, a large bias in the EM-estimated parameters is also produced in this case, so the problem is tackled by reducing the set of candidate pairs through filtering methods such as blocking. Unfortunately, the probability that excluded pairs are actually true matches cannot be assessed through such methods. The present work proposes an efficient approach in which the comparisons of records between lists are minimised, while the EM estimates are modified by modelling tables with structural zeros in order to obtain unbiased estimates of the parameters. The improvement achieved by the suggested method is shown by means of simulations and an application based on real data.
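The baseline Fellegi-Sunter estimation step referred to above can be sketched as a two-class latent mixture fitted by EM over binary comparison vectors, assuming conditional independence of the comparison fields given the latent match status. The field count, match rate and parameter values below are invented for illustration; the structural-zeros correction proposed in the paper is not included.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate binary agreement vectors for candidate pairs
n_pairs, n_fields, p_match = 20_000, 4, 0.05
m_true = np.array([0.95, 0.90, 0.85, 0.80])  # P(field agrees | match)
u_true = np.array([0.10, 0.05, 0.20, 0.15])  # P(field agrees | non-match)
is_match = rng.random(n_pairs) < p_match
probs = np.where(is_match[:, None], m_true, u_true)
gamma = (rng.random((n_pairs, n_fields)) < probs).astype(float)

# EM for the two-class mixture (pi = match prevalence)
pi, m, u = 0.5, np.full(n_fields, 0.8), np.full(n_fields, 0.3)
for _ in range(100):
    # E-step: posterior match probability for each candidate pair
    lm = (m ** gamma * (1 - m) ** (1 - gamma)).prod(axis=1)
    lu = (u ** gamma * (1 - u) ** (1 - gamma)).prod(axis=1)
    w = pi * lm / (pi * lm + (1 - pi) * lu)
    # M-step: update prevalence and per-field agreement probabilities
    pi = w.mean()
    m = (w[:, None] * gamma).sum(axis=0) / w.sum()
    u = ((1 - w)[:, None] * gamma).sum(axis=0) / (1 - w).sum()
print(f"pi_hat={pi:.3f}", "m_hat:", np.round(m, 2))
```

Applied to all candidate pairs this E-step is exactly the quadratic-cost bottleneck the abstract mentions, since every cross-list pair contributes a comparison vector.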
The multivariate skewed variance gamma (MSVG) distribution is useful in modelling data with high density around the location parameter along with moderate heavy-tailedness. However, the density can be unbounded for certain choices of the shape parameter. We propose a modification to the expectation-conditional maximisation (ECM) algorithm to calculate the maximum likelihood estimate (MLE) by introducing a small region to cap the conditional expectations in order to deal with the unbounded density. To facilitate application to financial time series, the mean is further extended to include autoregressive terms. Finally, the MSVG model is applied to analyse the returns of five daily closing price market indices. Standard errors (SEs) for the estimated parameters are computed using Louis' method.
Today, short- and long-term structural health monitoring (SHM) of bridge infrastructures and their safe, reliable and cost-effective maintenance have received considerable attention. From a surveying or civil engineer's point of view, vibration-based SHM can be conducted by inspecting changes in the global dynamic behaviour of a structure, such as natural frequencies (i.e. eigenfrequencies), mode shapes (i.e. eigenforms) and modal damping, which are known as modal parameters. This research work proposes a robust and automatic vibration analysis procedure, the so-called robust time domain modal parameter identification (RT-MPI) technique. It is novel in the sense of automatically and reliably identifying initial eigenfrequencies, even closely spaced ones, as well as robustly and accurately estimating the modal parameters of a bridge structure using a low number of cost-effective micro-electro-mechanical systems (MEMS) accelerometers. To estimate amplitude, frequency, phase shift and damping ratio coefficients, an observation model consisting of (1) a damped harmonic oscillation model, (2) an autoregressive model of coloured measurement noise and (3) a stochastic model in the form of the heavy-tailed family of scaled t-distributions is employed and jointly adjusted by means of a generalised expectation maximisation algorithm. Multiple MEMS accelerometers, as part of a geo-sensor network, were mounted at different positions of a bridge structure, the positions having been precalculated by means of a finite element model (FEM) analysis. Finally, the estimated eigenfrequencies and eigenforms are compared and validated against estimates obtained from acceleration measurements of high-end accelerometers of type PCB ICP quartz, velocity measurements from a geophone and the FEM analysis. Additionally, the estimated eigenfrequencies and modal damping are compared with a well-known covariance-driven stochastic subspace identification approach, which reveals the superiority of our proposed approach.
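Component (1) of the observation model can be illustrated in isolation: the sketch below fits a single damped harmonic oscillation to noisy accelerations by ordinary least squares. This is a deliberate simplification with white Gaussian noise and invented parameter values; the paper instead adjusts the oscillation jointly with an AR coloured-noise model and a scaled t stochastic model via generalised EM.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(11)

def damped_osc(t, A, f, phi, zeta):
    """Amplitude A, frequency f [Hz], phase shift phi, damping ratio zeta."""
    return A * np.exp(-zeta * 2 * np.pi * f * t) * np.cos(2 * np.pi * f * t + phi)

# Synthetic signal: 2.5 Hz mode, 2% damping, white measurement noise
t = np.linspace(0, 5, 2000)
y = damped_osc(t, 1.0, 2.5, 0.3, 0.02) + rng.normal(0, 0.05, t.size)

popt, _ = curve_fit(damped_osc, t, y, p0=[0.8, 2.5, 0.0, 0.05])
A, f, phi, zeta = popt
print(f"f_hat={f:.3f} Hz, zeta_hat={zeta:.4f}")
```

In practice the frequency starting value must come from a prior identification step (e.g. a peak in the spectrum), since nonlinear least squares for oscillations is sensitive to a poor frequency initialisation.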
Here, we consider time-to-event data where individuals can experience two or more types of events that are not distinguishable from one another without further confirmation, perhaps by laboratory test. The event type of primary interest can occur only once. The other types of events can recur. If the type of a portion of the events is identified, this forms a validation set. However, even if a random sample of events is tested, confirmations can be missing nonmonotonically, creating uncertainty about whether an individual is still at risk for the event of interest. For example, in a study to estimate the efficacy of an influenza vaccine, an individual may experience a sequence of symptomatic respiratory illnesses caused by various pathogens over the season. Often only a limited number of these episodes are confirmed in the laboratory to be influenza-related or not. We propose two novel methods to estimate covariate effects in this survival setting, and subsequently vaccine efficacy. The first is a pathway expectation-maximization (EM) algorithm that takes into account all pathways of event types in an individual compatible with that individual's test outcomes. The pathway EM algorithm iteratively estimates baseline hazards that are used to weight possible event types. The second method is a non-iterative pathway piecewise validation method that does not estimate the baseline hazards. These methods are compared with a previous simpler method. Simulation studies suggest that the mean squared error of the efficacy estimates is lower when the baseline hazards are estimated, especially at higher hazard rates. We use the pathway EM algorithm to reevaluate the efficacy of a trivalent live-attenuated influenza vaccine during the 2003-2004 influenza season in Temple-Belton, Texas, and compare our results with a previously published analysis.
With the emergence of numerical sensors in many aspects of everyday life, there is an increasing need for analyzing multivariate functional data. This work focuses on the clustering of such functional data, in order to ease their modeling and understanding. To this end, a novel clustering technique for multivariate functional data is presented. This method is based on a functional latent mixture model which fits the data into group-specific functional subspaces through a multivariate functional principal component analysis. A family of parsimonious models is obtained by constraining model parameters within and between groups. An expectation-maximization (EM) algorithm is proposed for model inference, and the choice of hyper-parameters is addressed through model selection. Numerical experiments on simulated datasets highlight the good performance of the proposed methodology compared to existing works. The algorithm is then applied to the analysis of pollution in French cities over one year.
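The underlying pipeline idea, representing each observed curve by its leading functional principal component scores and then clustering in that low-dimensional space, can be sketched as follows. Plain PCA on discretized curves plus k-means is a simplified stand-in for the paper's latent mixture model in group-specific functional subspaces; the two simulated curve groups are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two groups of noisy curves sampled on a common grid
t = np.linspace(0, 1, 100)
curves = np.vstack([
    np.sin(2 * np.pi * t) + rng.normal(0, 0.2, (50, t.size)),   # group 1
    np.cos(2 * np.pi * t) + rng.normal(0, 0.2, (50, t.size)),   # group 2
])

# Leading two principal component scores of the centered curves
centered = curves - curves.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt[:2].T

def kmeans(x, k=2, iters=50):
    """Tiny k-means on the score vectors."""
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
    return labels

labels = kmeans(scores)
print("cluster sizes:", np.bincount(labels))
```

The latent mixture model goes beyond this in that the subspace itself is group-specific and estimated jointly with the clustering, rather than fixed by a single global PCA.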
Misclassified current status data occur when each subject under study is observed only once and the failure status at the observation time is determined by a diagnostic test with imperfect sensitivity and specificity. In this article, we provide a methodology for the analysis of such data under a wide class of flexible semiparametric transformation models. For inference, a nonparametric maximum likelihood estimation procedure is proposed along with the development of an EM algorithm. Furthermore, we show that the resulting estimators of regression parameters are consistent, asymptotically normal and semiparametrically efficient. A simulation study and a real data application demonstrate that the proposed approach performs well in practice and is substantially superior to the naive method that ignores the misclassification.
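To show how misclassification enters the likelihood, here is a toy parametric version of the setting: an exponential failure time observed as current status data, where the recorded status comes from a test with sensitivity `se` and specificity `sp`, so that P(test positive at C) = se * F(C) + (1 - sp) * (1 - F(C)). This is only a sketch under an exponential model; the paper's method is semiparametric (NPMLE with an EM algorithm under transformation models).

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(9)

se, sp, rate_true = 0.9, 0.95, 0.5
n = 5000
T = rng.exponential(1 / rate_true, n)        # true (unobserved) failure times
C = rng.uniform(0.5, 4.0, n)                 # single observation time per subject
failed = T <= C
# Observed status: imperfect test applied once at time C
positive = np.where(failed, rng.random(n) < se, rng.random(n) < 1 - sp)

def negloglik(rate):
    F = 1 - np.exp(-rate * C)                # P(failed by C) under exponential model
    p_pos = se * F + (1 - sp) * (1 - F)      # P(test positive at C)
    return -(positive * np.log(p_pos) + (~positive) * np.log(1 - p_pos)).sum()

res = minimize_scalar(negloglik, bounds=(0.01, 5.0), method="bounded")
print(f"rate_hat = {res.x:.3f} (true 0.5)")
```

Replacing `se` and `sp` with 1 in `negloglik` gives the naive likelihood that ignores misclassification, which is biased whenever the test is imperfect.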