We introduce a one-step EM algorithm to estimate the graphical structure in a Poisson-log-normal graphical model. This procedure is equivalent to a normality transformation that makes the problem of identifying relationships in high-throughput microRNA (miRNA) sequence data feasible. Moreover, the Poisson-log-normal model allows us to account directly for the known overdispersion present in this data set. We show that our EM algorithm provides a provable increase in performance in determining the network structure. The model is shown to provide an increase in performance in simulation settings over a range of network structures. The model is applied to high-throughput miRNA sequencing data from patients with breast cancer from The Cancer Genome Atlas (TCGA). By selecting the most highly connected miRNA molecules in the fitted network, we find that nearly all of them are known to be involved in the regulation of breast cancer.
Maximum likelihood (ML) estimation of spatial autocorrelation models is well established for the case where each node in the graph is directly observed. When one or more nodes are not observed, the user has a variety of computational tools at her or his disposal ranging from the expectation-maximization algorithm, which has become a standard for missing-data problems, to marginal likelihood estimation methods and to fully Bayesian approaches. In this article, we give a comprehensive overview of likelihood-based computational frameworks for parameter estimation of the conditional autoregressive model, and we establish connections with several algorithms in the literature that are iterative and often computationally suboptimal. We show that a vanilla marginal ML approach, which we provide computational details for, is still generally orders of magnitude faster than the iterative approaches, even on large data sets and especially so when the number of unobserved units is relatively large.
In this paper, the Rayleigh-Lindley (RL) distribution is introduced, obtained by compounding the Rayleigh and Lindley discrete distributions, where the compounding procedure follows an approach similar to the one previously studied by Adamidis and Loukas in other contexts. The resulting distribution is a two-parameter model that is competitive with other parsimonious models such as the gamma and Weibull distributions. We study some properties of this new model, such as the moments and the mean residual life. Estimation is carried out via the EM algorithm. The behavior of the resulting estimators is studied in finite samples through a simulation study. Finally, we report two real data illustrations to show the performance of the proposed model versus other common two-parameter models in the literature. The main conclusion is that the proposed model can be a valid alternative to other competing models well established in the literature.
The problem of parameter estimation plays a significant role in various areas of academic research. In this article, we propose three different methods of estimation for the parameters of a location-scale family under ranked set sampling, viewed as a missing data mechanism. Through a series of Monte Carlo simulations, it is shown that the proposed methods are relatively robust to violations of the perfect ranking condition and perform better than their competitors under the bias and MSE (mean square error) criteria. An empirical data set is also used for illustrative purposes.
In this paper, we propose an approach for modeling claim dependence, with the assumption that the claim numbers and the aggregate claim amounts are mutually and serially dependent through an underlying hidden state and can be characterized by a hidden finite-state Markov chain using a bivariate hidden Markov model (BHMM). We construct three different BHMMs, namely the Poisson-Normal HMM, the Poisson-Gamma HMM, and the Negative Binomial-Gamma HMM, stemming from the most commonly used distributions in insurance studies. The expectation-maximization algorithm is implemented, and for the maximization of the state-dependent part of the log-likelihood of the BHMMs, the estimates are derived analytically. To illustrate the proposed model, motor third-party liability claims in Istanbul, Turkey, are analyzed in the framework of the Poisson-Normal HMM under different numbers of states. In addition, we derive the forecast distribution, calculate state predictions, and determine the most likely sequence of states. The results indicate that the dependence under indirect factors can be captured in terms of different states, namely low, medium, and high states.
In meta-analysis of clinical trials, standard statistical methods run into problems when the proportions of safety events are small. Motivated by the dataset used in a published analysis of cardiovascular safety in Rosiglitazone trials, this article proposes using a zero-inflated binomial model to handle the zero-event trials. The maximum likelihood estimates of the model parameters are obtained using the expectation-maximization algorithm. Via simulation studies, it is shown that the proposed methods provide estimates of odds ratios with less bias and variation, compared with both the Mantel-Haenszel method with continuity correction and Peto's method. The proposed methods are applied to the Rosiglitazone trials. Supplementary materials for this article are available online.
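To make the zero-inflated binomial idea concrete, the following is a minimal sketch of an EM iteration for a much simpler, single-arm model than the one in the article: each trial is a structural zero with probability `pi`, and otherwise its event count is Binomial(n, p). The function names, starting values, and the trial counts are hypothetical illustrations, and the odds-ratio structure of the article is not reproduced.

```python
import math

def zib_loglik(events, sizes, pi, p):
    """Observed-data log-likelihood of the simplified zero-inflated binomial model."""
    total = 0.0
    for y, n in zip(events, sizes):
        if y == 0:
            # A zero arises either structurally or from the binomial part.
            total += math.log(pi + (1.0 - pi) * (1.0 - p) ** n)
        else:
            total += (math.log(1.0 - pi) + math.log(math.comb(n, y))
                      + y * math.log(p) + (n - y) * math.log(1.0 - p))
    return total

def zib_em(events, sizes, pi=0.5, p=0.05, n_iter=200, tol=1e-10):
    """Run EM; returns (pi, p, log-likelihood history)."""
    history = [zib_loglik(events, sizes, pi, p)]
    for _ in range(n_iter):
        # E-step: posterior probability that each zero-event trial is a structural zero.
        z = [pi / (pi + (1.0 - pi) * (1.0 - p) ** n) if y == 0 else 0.0
             for y, n in zip(events, sizes)]
        # M-step: closed-form updates of the mixing weight and the event rate.
        pi = sum(z) / len(z)
        p = (sum((1.0 - zi) * y for zi, y in zip(z, events))
             / sum((1.0 - zi) * n for zi, n in zip(z, sizes)))
        history.append(zib_loglik(events, sizes, pi, p))
        if history[-1] - history[-2] < tol:
            break
    return pi, p, history

# Hypothetical event counts and arm sizes for ten trials (illustration only).
events = [0, 0, 0, 0, 0, 1, 2, 0, 3, 1]
sizes = [40, 60, 50, 45, 55, 50, 60, 40, 70, 50]
pi_hat, p_hat, history = zib_em(events, sizes)
print(f"pi = {pi_hat:.4f}, p = {p_hat:.4f}, iterations = {len(history) - 1}")
```

The log-likelihood history is returned so that the monotone increase expected of EM can be checked directly.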
This paper proposes a maximum likelihood approach to jointly estimate marginal conditional quantiles of multivariate response variables in a linear regression framework. We consider a slight reparameterization of the multivariate asymmetric Laplace distribution proposed by Kotz et al. (2001) and exploit its location-scale mixture representation to implement a new EM algorithm for estimating model parameters. The idea is to extend the link between the asymmetric Laplace distribution and the well-known univariate quantile regression model to a multivariate context, i.e., when a multivariate dependent variable is concerned. The approach accounts for association among multiple responses and studies how the relationship between responses and explanatory variables can vary across different quantiles of the marginal conditional distribution of the responses. A penalized version of the EM algorithm is also presented to tackle the problem of variable selection. The validity of our approach is analyzed in a simulation study, where we also provide evidence on the efficiency gain of the proposed method compared to estimation obtained by separate univariate quantile regressions. A real data application examines the main determinants of financial distress in a sample of Italian firms. (C) 2019 Elsevier Inc. All rights reserved.
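The univariate link mentioned in this abstract can be illustrated numerically: with the scale held fixed, maximizing an asymmetric Laplace likelihood over its location is the same as minimizing the quantile check loss, so the maximizer is a sample quantile. The sketch below covers only this univariate, intercept-only case; the function names and the toy data are assumptions made for illustration, and the multivariate model and EM algorithm of the paper are not reproduced.

```python
import math

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def al_neg_loglik(ys, mu, sigma, tau):
    """Negative log-likelihood of the univariate asymmetric Laplace density
    f(y) = tau * (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma))."""
    const = -len(ys) * math.log(tau * (1.0 - tau) / sigma)
    return const + sum(check_loss((y - mu) / sigma, tau) for y in ys)

def best_location(ys, tau, sigma=1.0):
    """Location minimizing the negative log-likelihood. The objective is convex
    and piecewise linear in mu, so searching over the data points suffices."""
    return min(ys, key=lambda q: al_neg_loglik(ys, q, sigma, tau))

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]        # toy data, illustration only
median_hat = best_location(data, tau=0.5)  # sample median
q25_hat = best_location(data, tau=0.25)    # lower-quartile estimate
print(median_hat, q25_hat)
```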
The endo-exo problem lies at the heart of statistical identification in many fields of science, and is often plagued by spurious strong-and-long memory due to improper treatment of trends, shocks and shifts in the data. A class of models that has been shown to be useful in discerning exogenous and endogenous activity is the Hawkes process. This class of point processes has enjoyed great recent popularity and rapid development within the quantitative finance literature, with particular focus on the study of market microstructure and high-frequency price fluctuations. We show that there are important lessons from older fields like time series and econometrics that should also be applied in financial point process modelling. In particular, we emphasize the importance of appropriately treating trends and shocks for the identification of the strength and length of memory in the system. We exploit the powerful expectation-maximization algorithm and objective statistical criteria (BIC) to select the flexibility of the deterministic background intensity. With these methods, we strongly reject the hypothesis that the considered financial markets are critical at univariate and bivariate microstructural levels.
The complexity of longitudinal data lies in the inherent dependence among measurements from the same subject at different time points. For multiple longitudinal responses, the problem is challenging due to inter-trait and intra-trait dependence. While linear mixed models are popularly used for analysing such data, appropriate inference on the shape of the population cannot be drawn for non-normal data sets. We propose a linear mixed model for joint quantile regression of multiple longitudinal responses. We consider an asymmetric Laplace distribution for quantile regression and estimate the model parameters by a Monte Carlo EM algorithm. A nonparametric bootstrap resampling method is used for estimating confidence intervals of the parameter estimates. Through extensive simulation studies, we investigate the operating characteristics of our proposed model and compare its performance to that of other traditional quantile regression models. We apply the proposed model to data from a nutrition education programme for hypercholesterolemic children in the USA.
Censored failure time data with a cured subgroup are frequently encountered in many scientific areas, including cancer screening research, tumorigenicity studies, and sociological surveys. Meanwhile, one may also encounter an extraordinarily large number of risk factors in practice, such as patients' demographic characteristics, clinical measurements, and medical history, which makes variable selection an emerging need in the data analysis. Motivated by a medical study on prostate cancer screening, we develop a variable selection method for the semiparametric nonmixture or promotion time cure model when interval-censored data with a cured subgroup are present. Specifically, we propose a penalized likelihood approach with the least absolute shrinkage and selection operator, adaptive least absolute shrinkage and selection operator, or smoothly clipped absolute deviation penalties, which can be easily implemented via a novel penalized expectation-maximization algorithm. We assess the finite-sample performance of the proposed methodology through extensive simulations and analyze the prostate cancer screening data for illustration.