This paper introduces the range moment (RM), range skewness (RS), and range kurtosis (RK). We derive the formula of the RM for the univariate elliptical family, with emphasis on the normal, Student-t, logistic, Laplace, and Pearson type VII distributions. We also present explicit expressions of the range Value-at-Risk (RVaR), range variance (RV), RS, and RK for those distributions. Moreover, the RVaR for the sum of elliptical risks is derived. In addition, we provide maximum likelihood estimation of the parameters of the elliptical family via an EM-type algorithm. Range mean-variance optimal portfolio selection is given as an application. As illustrative examples, the RVaRs, RVs, RSs, and RKs of different distributions in the elliptical family are calculated and compared, and the Monte Carlo method is then used to estimate the range moments, which are compared with the theoretical values. Further, we fit different univariate elliptical distributions to real data and select the best distribution using the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), respectively. Finally, the RVaRs, RVs, RSs, and RKs of the daily log-returns of four stocks from the Nasdaq stock market are discussed.
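The comparison between theoretical and Monte Carlo values described in this abstract can be illustrated for the normal case. The sketch below is not taken from the paper. It assumes the common definition RVaR_{p,q}(X) = (1/(q-p)) ∫_p^q VaR_u(X) du, which for X ~ N(mu, sigma^2) gives mu + sigma (φ(z_p) − φ(z_q)) / (q − p), with z_u the standard normal quantile and φ its density. The paper's own convention may differ, and the function names `rvar_normal` and `rvar_monte_carlo` are hypothetical.

```python
import random
from statistics import NormalDist


def rvar_normal(mu, sigma, p, q):
    """Closed-form RVaR_{p,q} for N(mu, sigma^2) under the assumed definition:
    the average of the quantile function over the probability range [p, q]."""
    std = NormalDist()
    z_p, z_q = std.inv_cdf(p), std.inv_cdf(q)
    return mu + sigma * (std.pdf(z_p) - std.pdf(z_q)) / (q - p)


def rvar_monte_carlo(sample, p, q):
    """Empirical RVaR: mean of the order statistics lying between the
    p-th and q-th empirical quantiles of the sample."""
    xs = sorted(sample)
    n = len(xs)
    window = xs[int(n * p):int(n * q)]
    return sum(window) / len(window)


# Illustrative setup (assumed values): standard normal losses, range [0.95, 0.99].
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
exact = rvar_normal(0.0, 1.0, 0.95, 0.99)
estimate = rvar_monte_carlo(sample, 0.95, 0.99)
print(f"closed form: {exact:.4f}, Monte Carlo: {estimate:.4f}")
```

The closed-form value always lies between the p-th and q-th quantiles, which gives a quick sanity check on either computation.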
Panel binary data arise in an event history study when study subjects are observed only at discrete time points instead of continuously, and the only available information on the recurrent event of interest is whether it has occurred between two consecutive observation times, that is, within each observation window. Although some methods have been proposed for regression analysis of such data, all of them assume independent observation times or processes, which may not hold in practice. To address this, we propose a joint modeling procedure that allows for informative observation processes. To implement the proposed method, a computationally efficient EM algorithm is developed, and the resulting estimators are shown to be consistent and asymptotically normal. A simulation study conducted to assess its performance indicates that it works well in practical situations, and the proposed approach is applied to the motivating data set from the Health and Retirement Study.
We addressed genomic prediction accounting for partial correlation of marker effects, which entails the estimation of the partial correlation network/graph (PCN) and the precision matrix of an unobservable m-dimensional random variable. To this end, we developed a set of statistical models and methods by extending the canonical model selection problem in Gaussian concentration and directed acyclic graph models. Our frequentist formulations combined existing methods with the EM algorithm and were termed Glasso-EM, Concord-EM and CSCS-EM, whereas our Bayesian formulations corresponded to hierarchical models termed Bayes G-Sel and Bayes DAG-Sel. We applied our methods to a real bull fertility dataset and then carried out gene annotation of the seven markers having the highest degrees in the estimated PCN. Our findings provided biological evidence supporting the usefulness of identifying genomic regions that are highly connected in the inferred PCN. Moreover, a simulation study showed that some of our methods can accurately recover the PCN (accuracy up to 0.98 using Concord-EM), estimate the precision matrix (Concord-EM yielded the best results) and predict breeding values (the best reliability was 0.85 for a trait with heritability of 0.5 using Glasso-EM).
The additive reserving model assumes the existence of volume measures such that the corresponding expected loss ratios are identical for all accident years. While the classical literature assumes these volumes are known, in practice accurate volume measures are often unavailable. The issue of uncertain volume measures in the additive model was addressed in a generalization of the loss ratio method published in 2018. The derivation is rather complex and the method is computationally intensive, especially for large loss development triangles. This paper introduces an alternative approach that leverages the well-established EM algorithm, significantly reducing computational requirements.
Disease registry data provide important information on the progression of disease conditions. However, reports of death or drop-out of patients enrolled in the registry are always subject to a noticeable delay. Reporting delays, together with the administrative censoring that arises from a freeze date in data collection, lead to two layers of right censoring in the data. The first layer results from random drop-out and acts on the survival time. The second layer is the administrative censoring, which acts on the sum of the reporting delay and the minimum of the survival time and random drop-out time. Heterogeneity among patients further complicates data analysis. This paper proposes a novel semiparametric sieve method based on phase-type distributions, in which covariates can be readily accommodated by the accelerated failure time model. A carefully designed EM algorithm is developed to compute the sieve maximum likelihood estimator. We establish the consistency and rate of convergence of the proposed sieve estimators, as well as the asymptotic normality and semiparametric efficiency of the estimators for the regression parameters. Comprehensive simulations and a real example of lung cancer registry data are used to demonstrate the proposed method. The results reveal substantial biases if reporting delays are overlooked.
The expectation-maximisation (EM) algorithm can be used to adjust the sample size for a time-to-event endpoint without unblinding. Nevertheless, censoring or unreliable initial estimates may cause the EM algorithm to yield inconsistent estimates. To address these limitations, we propose a bi-endpoint EM algorithm that incorporates the time-to-event endpoint and another endpoint, which can be of various types and is not limited to efficacy indicators, during the EM iterations. Additionally, we suggest two approaches for choosing initial estimates. The application conditions are as follows: (i) at least one endpoint's initial estimate is reliable and (ii) the influence of this endpoint on the posterior distribution of the latent variable exceeds that of the other endpoint.
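To make the role of the latent variable concrete, the following minimal sketch runs a plain single-endpoint EM on blinded, uncensored event times, treating the unknown arm assignment as latent. It is not the bi-endpoint algorithm proposed in the paper. It assumes exponential event times in each arm and a known 1:1 allocation, and the function name `em_exponential_mixture`, the simulated rates, and the starting values are all hypothetical.

```python
import math
import random


def em_exponential_mixture(times, rate_a, rate_b, weight=0.5, iters=50):
    """EM for a two-arm exponential mixture with a fixed, known allocation weight.

    Returns the final rate estimates and the observed-data log-likelihood
    evaluated at the parameters held at the start of each iteration.
    """
    history = []
    for _ in range(iters):
        # E-step: posterior probability that each subject belongs to arm A.
        resp = []
        loglik = 0.0
        for t in times:
            f_a = weight * rate_a * math.exp(-rate_a * t)
            f_b = (1.0 - weight) * rate_b * math.exp(-rate_b * t)
            resp.append(f_a / (f_a + f_b))
            loglik += math.log(f_a + f_b)
        history.append(loglik)
        # M-step: weighted exponential MLE (weighted count / weighted exposure).
        rate_a = sum(resp) / sum(r * t for r, t in zip(resp, times))
        rate_b = sum(1.0 - r for r in resp) / sum((1.0 - r) * t for r, t in zip(resp, times))
    return rate_a, rate_b, history


# Simulated blinded data (assumed rates 1.0 and 0.4, 300 subjects per arm).
rng = random.Random(1)
times = [rng.expovariate(1.0) for _ in range(300)] + [rng.expovariate(0.4) for _ in range(300)]
rate_a, rate_b, history = em_exponential_mixture(times, rate_a=1.5, rate_b=0.3)
print(rate_a, rate_b, history[-1])
```

The log-likelihood never decreases across iterations, but the point estimates can depend strongly on the starting values, which is the kind of sensitivity to initial estimates that the abstract describes.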
Because of their inherent limitations, switched autoregressive exogenous (SARX) models may perform poorly when modeling nonlinear hybrid dynamic systems. To address this problem, a robust identification approach for the switched gated recurrent unit (SGRU) model is developed in this paper. First, all submodels of the SARX model are replaced by gated recurrent unit neural networks, so that the resulting SGRU model has a stronger nonlinear fitting ability than the SARX model. Second, this paper departs from the conventional Gaussian assumption for the noise and adopts a generalized Gaussian distribution instead, which enables the proposed model to achieve stable prediction performance under different types of noise. Notably, no prior knowledge of the operating modes is assumed in the proposed switched model. The EM algorithm is therefore used to solve the parameter estimation problem with hidden variables. Finally, two simulation experiments are performed. The effectiveness of the proposed approach is verified by comparing the nonlinear fitting ability of the SGRU model with that of the SARX model, and by comparing the prediction performance of the SGRU model under different noise distributions.
The aim is to establish an automated investing strategy that can imitate an advisor's behaviour in financial markets. To this end, we review previous studies of the Markov regime-switching model, whose regime durations are geometrically distributed, and propose a semi-Markov regime-switching model whose durations follow a general distribution. By extending the state space of the semi-Markov chain, the model is transformed into a Markov regime-switching model. As the full information of the semi-Markov regime-switching model is available in this problem, we propose a divide-and-conquer, computationally tractable algorithm to estimate the parameters. Experiments with empirical datasets show that the automated investing strategy based on the estimated parameters behaves like the investment advisor. The automated strategy can relieve the advisor of tedious routine work and allows the advisor's advice to be evaluated thoroughly.
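One standard way to extend the state space of a semi-Markov chain, sketched below, is to track the pair (regime, time already spent in the regime). The abstract does not spell out the paper's construction, so this is an assumed textbook version for durations with finite support. The function name `expand_semi_markov` and the example inputs are hypothetical.

```python
def expand_semi_markov(jump, duration_pmf):
    """Build a Markov transition matrix over (regime, elapsed time) states.

    jump[i][j]        : probability of entering regime j when a stay in regime i ends
                        (jump[i][i] is assumed to be 0).
    duration_pmf[i][d-1]: probability that a stay in regime i lasts exactly d steps
                        (the last entry is assumed to be positive).
    """
    states = [(i, d) for i, pmf in enumerate(duration_pmf) for d in range(1, len(pmf) + 1)]
    index = {s: k for k, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for (i, d), row in index.items():
        pmf = duration_pmf[i]
        survive = sum(pmf[d - 1:])        # P(D >= d)
        hazard = pmf[d - 1] / survive     # P(D = d | D >= d): the stay ends now
        if d < len(pmf):
            # The stay continues: same regime, elapsed time advances by one.
            P[row][index[(i, d + 1)]] = 1.0 - hazard
        for j, p_ij in enumerate(jump[i]):
            if p_ij > 0:
                # The stay ends: enter regime j with elapsed time reset to 1.
                P[row][index[(j, 1)]] += hazard * p_ij
    return states, P


# Two regimes with non-geometric durations (assumed illustrative values).
jump = [[0.0, 1.0], [1.0, 0.0]]
duration_pmf = [[0.2, 0.5, 0.3], [0.6, 0.4]]
states, P = expand_semi_markov(jump, duration_pmf)
for state, row in zip(states, P):
    print(state, [round(x, 3) for x in row])
```

With a geometric duration the hazard is constant in the elapsed time, so the extra coordinate becomes redundant and the construction collapses back to an ordinary Markov regime-switching chain.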
Model-based clustering tackles the task of uncovering heterogeneity in a data set to extract valuable insights. Given the common presence of outliers in practice, robust methods for model-based clustering have been proposed. However, the use of many methods in this area becomes severely limited in applications where partially observed records are common since their existing frameworks often assume complete data only. Here, a mixture of multiple scaled contaminated normal (MSCN) distributions is extended using the expectation-conditional maximization (ECM) algorithm to accommodate data sets with values missing at random. The newly proposed extension preserves the mixture's capability in yielding robust parameter estimates and performing automatic outlier detection separately for each principal component. In this fitting framework, the MSCN marginal density is approximated using the inversion formula for the characteristic function. Extensive simulation studies involving incomplete data sets with outliers are conducted to evaluate parameter estimates and to compare clustering performance and outlier detection of our model to other mixtures.
In this paper, we propose a new method to deal with uncertain data in the context of Common Cause Failure (CCF) analysis. Uncertain CCF data refer to the data for which the number of components involved in the failure events is not exactly known. We introduce a new formalism to describe uncertain CCF data to avoid subjective probabilities for the number of failed components in each CCF event that are used in classical methods such as the impact vector method. The parameters of the alpha-factor model are estimated using the maximum likelihood method relying on properties of the nested Dirichlet distribution and grouped Dirichlet distribution. A data augmentation technique with an expectation-maximization algorithm is also developed for some schemes of data with uncertainty. Finally, we evaluate the performance of the proposed method through numerical simulations and illustrate its application using an example from the literature.