Zeros in compositional data are very common and can be classified into rounded and essential zeros. A rounded zero refers to a small proportion or a value below the detection limit, while an essential zero refers to the complete absence of a component from the composition. In this article, we propose a new framework for analyzing compositional data with zero entries by introducing a stochastic representation. In particular, a new distribution, the Dirichlet composition distribution, is developed to accommodate the possible essential-zero feature of compositional data. We derive its distributional properties (e.g., its moments), propose maximum likelihood estimation via the expectation-maximization (EM) algorithm, and consider a regression model based on the new distribution. Simulation studies are conducted to evaluate the performance of the proposed methodologies. Finally, our method is employed to analyze a fluorescence in situ hybridization (FISH) dataset for chromosome detection.
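A minimal sketch of the essential-zero idea behind this abstract, not the paper's actual Dirichlet composition distribution: latent Bernoulli indicators decide which components are present, and a Dirichlet draw over the present components yields the observed composition. The concentration and presence probabilities below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_zero_composition(alpha, presence_prob, size=1):
    """Simulate compositions with possible essential zeros (illustrative only).

    alpha         : Dirichlet concentration for each component (length D)
    presence_prob : probability that each component is present (length D)
    """
    alpha = np.asarray(alpha, dtype=float)
    D = alpha.size
    out = np.zeros((size, D))
    for i in range(size):
        present = rng.random(D) < presence_prob
        if not present.any():                  # force at least one component
            present[rng.integers(D)] = True
        out[i, present] = rng.dirichlet(alpha[present])
    return out

X = sample_zero_composition(alpha=[2.0, 1.0, 0.5], presence_prob=[0.9, 0.7, 0.5], size=5)
print(X)                  # rows sum to 1; structural zeros where components are absent
print(X.sum(axis=1))
```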
This work makes two fundamental contributions. On the application side, it tackles one of the most challenging problems: predicting day-ahead cryptocurrency prices. On the theoretical front, it proposes a new dynamical modeling approach. The proposed approach retains both the probabilistic formulation of the state-space model, which yields point estimates along with their uncertainty, and the function-approximation ability of a deep neural network. We call the proposed approach the deep state-space model. Experiments are carried out on established cryptocurrencies (data obtained from Yahoo Finance), with the goal of predicting the next day's price. Benchmarking is performed against both state-of-the-art and classical dynamical modeling techniques, and results show that the proposed approach yields the best overall accuracy.
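A sketch of the state-space ingredient only, under the assumption of a simple local-level (random-walk) model with known noise variances: a Kalman recursion produces a day-ahead point forecast together with its predictive variance. The deep state-space model described above replaces the linear transition with a deep neural network, which is omitted here.

```python
import numpy as np

def local_level_forecast(prices, q=1.0, r=1.0):
    """One-step-ahead forecast (mean, variance) for a local-level model.

    q : state (level) noise variance, r : observation noise variance (assumed known)
    """
    m, p = prices[0], 1.0                      # filtered mean and variance
    for y in prices[1:]:
        m_pred, p_pred = m, p + q              # predict
        k = p_pred / (p_pred + r)              # Kalman gain
        m = m_pred + k * (y - m_pred)          # update
        p = (1.0 - k) * p_pred
    return m, p + q + r                        # next-day predictive mean and variance

prices = np.array([100.0, 101.5, 99.8, 102.3, 103.1])   # toy price series
mean, var = local_level_forecast(prices)
print(f"day-ahead forecast: {mean:.2f} +/- {1.96 * np.sqrt(var):.2f}")
```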
A method for generalized linear regression with interval-censored covariates is described, extending previous approaches. A scenario is considered in which an interval-censored covariate of interest is defined as a function of other variables. Instead of directly modeling the distribution of the interval-censored covariate of interest, the distributions of the variables which determine that covariate are modeled, and the distribution of the covariate of interest is inferred indirectly. This approach leads to an estimation procedure using the expectation-maximization (EM) algorithm. The performance of this approach is compared to two alternative approaches, one in which the censoring interval midpoints are used as estimates of the censored covariate values, and another in which the censored values are multiply imputed using uniform distributions over the censoring intervals. A simulation framework is constructed to assess these methods' accuracies across a range of scenarios. The proposed approach is found to have less bias than midpoint analysis and uniform imputation, at the cost of small increases in standard error.
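A small sketch of the two comparison approaches named in this abstract (not the proposed EM method): midpoint substitution and uniform multiple imputation for an interval-censored covariate in a linear model. The data-generating settings are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.gamma(2.0, 2.0, n)                       # true covariate (unobserved)
y = 1.0 + 0.5 * x + rng.normal(0, 1, n)          # outcome
lo, hi = np.floor(x), np.floor(x) + 1.0          # observed censoring interval [lo, hi)

def ols_slope(xv, yv):
    X = np.column_stack([np.ones_like(xv), xv])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1]

# 1) midpoint analysis: substitute the interval midpoint for the censored value
b_mid = ols_slope((lo + hi) / 2.0, y)

# 2) uniform multiple imputation: draw x ~ U(lo, hi), fit, average the slopes
b_mi = np.mean([ols_slope(rng.uniform(lo, hi), y) for _ in range(20)])

print(f"midpoint slope: {b_mid:.3f}, multiple-imputation slope: {b_mi:.3f} (true 0.5)")
```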
In this paper, we propose a new estimation methodology based on a projected non-linear conjugate gradient (PNCG) algorithm with an efficient line search technique. We develop a general PNCG algorithm for a survival model incorporating a cured proportion under a competing risks setup, where the initial competing risks are subject to elimination after an initial treatment (known as destruction). In the literature, the expectation-maximization (EM) algorithm has been widely used to estimate the parameters of such models. Through an extensive Monte Carlo simulation study, we compare the performance of the proposed PNCG algorithm with that of the EM algorithm and demonstrate its advantages. Through simulation, we also show the advantages of the proposed methodology over other optimization algorithms (including other conjugate-gradient-type methods) readily available as R software packages. For these comparisons, we assume the initial number of competing risks follows a negative binomial distribution, although the general algorithm works with any competing risks distribution. Finally, we apply the proposed algorithm to analyze a well-known melanoma dataset.
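A generic sketch of a projected nonlinear conjugate gradient step with a backtracking line search, applied to an arbitrary smooth objective under nonnegativity constraints; it is not the paper's destructive cure-rate likelihood, and the toy exponential objective below is an assumption made for illustration.

```python
import numpy as np

def pncg(f, grad, x0, lower=0.0, max_iter=200, tol=1e-8):
    """Minimize f subject to x >= lower with a projected Polak-Ribiere+ CG scheme."""
    x = np.maximum(np.asarray(x0, float), lower)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        # backtracking (Armijo) line search along d, projecting onto the feasible set
        t = 1.0
        while f(np.maximum(x + t * d, lower)) > f(x) + 1e-4 * t * (g @ d) and t > 1e-12:
            t *= 0.5
        x_new = np.maximum(x + t * d, lower)
        g_new = grad(x_new)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))   # Polak-Ribiere+ update
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x

# toy objective: negative log-likelihood of exponential data with rate lam = x[0]
data = np.array([0.8, 1.2, 0.5, 2.0, 1.1])
f = lambda x: x[0] * data.sum() - data.size * np.log(max(x[0], 1e-12))
grad = lambda x: np.array([data.sum() - data.size / max(x[0], 1e-12)])
print(pncg(f, grad, [0.1]))    # MLE is n / sum(data)
```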
Joint modeling of survival and longitudinal data has been studied extensively in the recent literature. The likelihood approach is one of the most popular estimation methods employed within the joint modeling framework. Typically, the parameters are estimated by maximum likelihood, with computation performed by the expectation-maximization (EM) algorithm. One drawback of this approach, however, is that standard error (SE) estimates are not automatically produced by the EM algorithm. Many procedures have been proposed to obtain the asymptotic covariance matrix of the parameters when the number of parameters is small. In the joint modeling context, however, there may be an infinite-dimensional parameter, the baseline hazard function, which greatly complicates the problem, so that existing methods cannot be readily applied. The profile likelihood and bootstrap methods overcome the difficulty to some extent; however, they can be computationally intensive. In this paper, we propose two new methods for SE estimation using the EM algorithm that allow for more efficient computation of the SEs of a subset of parametric components in a semiparametric or high-dimensional parametric model. The precision and computation time are evaluated through a thorough simulation study. We conclude with an application of our SE estimation method to an HIV clinical trial dataset.
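A minimal sketch of one standard route to post-EM standard errors, shown on a toy normal model rather than a joint model: numerically differentiate the observed-data log-likelihood at the estimate to obtain the observed information, invert it, and read off the SEs of the parameters of interest. The paper's two proposed methods are more efficient alternatives to this brute-force approach.

```python
import numpy as np

def observed_info(loglik, theta_hat, eps=1e-5):
    """Negative Hessian of the log-likelihood via central finite differences."""
    p = len(theta_hat)
    H = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            e_i, e_j = np.eye(p)[i] * eps, np.eye(p)[j] * eps
            H[i, j] = (loglik(theta_hat + e_i + e_j) - loglik(theta_hat + e_i - e_j)
                       - loglik(theta_hat - e_i + e_j) + loglik(theta_hat - e_i - e_j)) / (4 * eps**2)
    return -H

rng = np.random.default_rng(2)
y = rng.normal(3.0, 2.0, 200)
loglik = lambda th: np.sum(-0.5 * np.log(2 * np.pi * th[1]**2) - (y - th[0])**2 / (2 * th[1]**2))
theta_hat = np.array([y.mean(), y.std()])          # MLE of (mu, sigma)
se = np.sqrt(np.diag(np.linalg.inv(observed_info(loglik, theta_hat))))
print(f"SE(mu) ~ {se[0]:.3f}  (analytic sigma_hat/sqrt(n) = {y.std()/np.sqrt(len(y)):.3f})")
```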
The expectation-maximization (EM) algorithm is widely used, including in industry, for parameter estimation within a maximum likelihood (ML) framework in the presence of missing data. It is well known that EM converges well in several cases of practical interest. To the best of our knowledge, however, results showing under which conditions EM converges fast are available only for specific cases. In this paper, we analyze the connection of the EM algorithm to other ascent methods as well as its convergence rates in general, including nonlinear models, and apply the analysis to the PMHT model. We compare EM with other well-known iterative schemes such as gradient and Newton-type methods. It is shown that EM attains Newton-like convergence in the case of well-separated objects, and that a Newton-EM combination is robust and efficient even for closely spaced targets.
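A small illustration of the separation effect discussed above, on a toy problem rather than the PMHT model: EM updates for the two means of a one-dimensional Gaussian mixture with known unit variances and equal weights. With well-separated components the responsibilities are nearly 0/1 and EM converges in a handful of iterations.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-4, 1, 300), rng.normal(4, 1, 300)])   # well separated

mu = np.array([-1.0, 1.0])                 # initial means
for it in range(20):
    # E-step: responsibilities of component 1 for each point
    d0, d1 = np.exp(-0.5 * (x - mu[0])**2), np.exp(-0.5 * (x - mu[1])**2)
    r = d1 / (d0 + d1)
    # M-step: responsibility-weighted means
    mu_new = np.array([np.sum((1 - r) * x) / np.sum(1 - r), np.sum(r * x) / np.sum(r)])
    if np.max(np.abs(mu_new - mu)) < 1e-10:
        break
    mu = mu_new
print(f"converged in {it + 1} iterations to means {mu.round(3)}")
```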
Many large-scale surveys collect both discrete and continuous variables. Small-area estimates may be desired for means of continuous variables, proportions in each level of a categorical variable, or for domain means defined as the mean of the continuous variable for each level of the categorical variable. In this paper, we introduce a conditionally specified bivariate mixed-effects model for small-area estimation, and provide a necessary and sufficient condition under which the conditional distributions render a valid joint distribution. The conditional specification allows better model interpretation. We use the valid joint distribution to calculate empirical Bayes predictors and use the parametric bootstrap to estimate the mean squared error. Simulation studies demonstrate the superior performance of the bivariate mixed-effects model relative to univariate model estimators. We apply the bivariate mixed-effects model to construct estimates for small watersheds using data from the Conservation Effects Assessment Project, a survey developed to quantify the environmental impacts of conservation efforts. We construct predictors of mean sediment loss, the proportion of land where the soil loss tolerance is exceeded, and the average sediment loss on land where the soil loss tolerance is exceeded. In the data analysis, the bivariate mixed-effects model leads to more scientifically interpretable estimates of domain means than those based on two independent univariate models.
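A univariate sketch of the two computational tools mentioned above, an empirical Bayes (EB) predictor and a parametric-bootstrap MSE estimate, for a basic area-level model y_i = theta_i + e_i with theta_i ~ N(mu, s2), e_i ~ N(0, d_i), and known d_i. The bivariate conditionally specified model of the paper is considerably more involved; all settings below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
m, d = 30, np.full(30, 0.5)                     # areas and known sampling variances
theta = rng.normal(2.0, 1.0, m)                 # true small-area means
y = theta + rng.normal(0, np.sqrt(d))           # direct estimates

def eb_predict(y, d):
    mu_hat = y.mean()
    s2_hat = max(np.var(y, ddof=1) - d.mean(), 0.01)    # crude moment estimate
    gamma = s2_hat / (s2_hat + d)
    return gamma * y + (1 - gamma) * mu_hat, mu_hat, s2_hat

pred, mu_hat, s2_hat = eb_predict(y, d)

# parametric bootstrap estimate of the MSE of the EB predictor
B, mse = 200, np.zeros(m)
for _ in range(B):
    theta_b = rng.normal(mu_hat, np.sqrt(s2_hat), m)
    y_b = theta_b + rng.normal(0, np.sqrt(d))
    mse += (eb_predict(y_b, d)[0] - theta_b) ** 2 / B

print(pred[:3].round(3), mse[:3].round(3))
```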
An important property that any lifetime model should satisfy is scale invariance. In this paper, a new scale-invariant quasi-inverse Lindley (QIL) model is presented and studied. Its basic properties, including moments, quantiles, skewness, kurtosis, and the Lorenz curve, are derived. In addition, the well-known dynamic reliability measures, such as the failure rate (FR), reversed failure rate (RFR), mean residual life (MRL), mean inactivity time (MIT), quantile residual life (QRL), and quantile inactivity time (QIT), are discussed. The FR function can be decreasing or upside-down bathtub-shaped, and the MRL and median residual lifetime may have a bathtub shape. The parameters of the model are estimated by the maximum likelihood method and the expectation-maximization (EM) algorithm; the EM algorithm is an iterative method suited to models with a latent variable, for example mixture or competing risks models. A simulation study is then conducted to examine the consistency and efficiency of the estimators and to compare them. The simulation study shows that the EM approach provides better estimation of the parameters. Finally, the proposed model is fitted to a reliability engineering data set along with some competing models. The Akaike information criterion (AIC), Kolmogorov-Smirnov (K-S), Cramer-von Mises (CVM), and Anderson-Darling (AD) statistics are used to compare the considered models.
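A generic sketch of the model-comparison workflow named above (maximum likelihood fit, AIC, and a Kolmogorov-Smirnov statistic), using SciPy's built-in log-normal distribution as a stand-in because the QIL density itself is not reproduced in this abstract; the simulated lifetimes are likewise an assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
t = rng.lognormal(mean=1.0, sigma=0.6, size=100)      # toy lifetime data

shape, loc, scale = stats.lognorm.fit(t, floc=0)      # MLE with location fixed at 0
loglik = np.sum(stats.lognorm.logpdf(t, shape, loc=loc, scale=scale))
aic = 2 * 2 - 2 * loglik                              # two free parameters
ks = stats.kstest(t, 'lognorm', args=(shape, loc, scale))
print(f"AIC = {aic:.1f}, K-S statistic = {ks.statistic:.3f}, p = {ks.pvalue:.3f}")
```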
The Lomax distribution has been widely used in economics, business, and the actuarial sciences. Due to its importance, we consider statistical inference for this model under a joint type-II censoring scenario. To estimate the parameters, we derive the Newton-Raphson (NR) procedure and observe that, in most simulation runs, the NR algorithm fails to converge. Consequently, we make use of the expectation-maximization (EM) algorithm. Moreover, Bayesian estimates are provided under squared error, linear-exponential, and generalized entropy loss functions, computed with the importance sampling method because of the structure of the posterior density function. We then perform a Monte Carlo simulation experiment to compare the performance of these methods, using mean squared errors, averages of the estimates, coverage probabilities, and average interval lengths. Approximate, bootstrap-p, and bootstrap-t confidence intervals are computed for the EM estimates, and Bayesian coverage probabilities and credible intervals are also obtained. Finally, we analyze the bladder cancer data to illustrate the applicability of the methods covered in the paper.
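A toy sketch of the Bayesian ingredient only: a squared-error-loss Bayes estimate (i.e., the posterior mean) of the Lomax shape parameter with unit scale and a complete sample, computed by importance sampling with the prior as the proposal. The joint type-II censoring scheme, the other loss functions, and the prior choice below are not taken from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = stats.lomax.rvs(c=2.0, size=100, random_state=rng)   # Lomax(shape=2, scale=1) data

def loglik(alpha, x):
    # log-likelihood of the Lomax shape parameter with unit scale
    return x.size * np.log(alpha) - (alpha + 1) * np.sum(np.log1p(x))

# proposal = assumed Gamma(1, 1) prior on the shape; weights proportional to the likelihood
alphas = rng.gamma(1.0, 1.0, 20000)
logw = np.array([loglik(a, x) for a in alphas])
w = np.exp(logw - logw.max())
bayes_sq_loss = np.sum(w * alphas) / np.sum(w)            # posterior mean
print(f"Bayes estimate under squared-error loss: {bayes_sq_loss:.3f}")
```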
With the rapid development of metro systems, it has become increasingly important to study phenomena such as passenger flow distribution and passenger boarding behavior. Existing methods find it difficult to describe actual situations accurately and to extend to a whole metro system because of parameter uncertainties in their mathematical models. In this article, we propose a passenger-to-train assignment model to evaluate the probabilities of individual passengers boarding each feasible train for both no-transfer and one-transfer trips. This model can be used to understand passenger flows and crowdedness. The input parameters of the model are the probabilities that passengers board each train and the probability distribution of egress time, which is the time taken to walk to the tap-out fare gate after alighting from the train. We present a likelihood method to estimate these parameters from data collected by the automatic fare collection and automatic vehicle location systems. This method constructs nonparametric density estimates without assuming a parametric form for the egress time distribution, and the EM algorithm is used to compute the maximum likelihood estimates. Simulation results indicate that the proposed estimates perform well. By applying our method to real data from the Beijing metro system, we identify different passenger flow patterns between peak and off-peak hours.
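A minimal sketch of the assignment idea for a no-transfer trip: given the arrival times of the feasible trains at the destination and the passenger's tap-out time, the posterior probability of each train is proportional to the prior boarding probability times the egress-time density evaluated at tap-out time minus arrival time. The boarding probabilities and the log-normal egress-time density below are placeholders; the paper estimates these quantities from AFC/AVL data with the EM algorithm.

```python
import numpy as np
from scipy import stats

def train_posteriors(arrivals, tap_out, prior, egress_dist):
    """Probability that the passenger was on each feasible train."""
    egress = tap_out - np.asarray(arrivals)               # implied egress times (minutes)
    like = np.where(egress > 0, egress_dist.pdf(egress), 0.0) * np.asarray(prior)
    return like / like.sum()

arrivals = np.array([10.0, 13.0, 16.0])                    # feasible trains' arrival times
prior = np.array([0.5, 0.3, 0.2])                          # assumed boarding probabilities
egress_dist = stats.lognorm(s=0.4, scale=3.0)              # assumed egress-time density
print(train_posteriors(arrivals, tap_out=17.5, prior=prior, egress_dist=egress_dist))
```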