Recently, it has been observed that the bivariate generalized linear failure rate distribution can be used quite effectively to analyze lifetime data in two dimensions. This paper introduces a more general class of bivariate distributions, which we refer to as the bivariate generalized linear failure rate-power series model. This new class contains several lifetime models as special cases, including the generalized linear failure rate-power series, bivariate generalized linear failure rate, and bivariate generalized linear failure rate geometric distributions, among others. The construction and characteristics of the proposed bivariate distribution are presented along with estimation procedures for the model parameters based on maximum likelihood. The marginal and conditional laws are also studied. We present an application to a real data set, where our model provides a better fit than other models.
The composite Bernstein copula (CBC) (Yang et al., 2015) is a copula function generated from a composition of two copulas. This paper first shows that some well-known copulas belong to the CBC family with desirable properties. An EM algorithm for estimating the CBC is proposed and applied to a real dataset to show how well the CBC fits when modeling dependence. The probabilistic structure of the CBC family is presented, which is useful for generating random numbers from the CBC. Finally, the probabilistic structure of the CBC is applied to credit risk analysis of collateralized debt obligations to show its advantage in empirical analysis. (C) 2017 Elsevier B.V. All rights reserved.
Restricted versions of the cointegrated vector autoregression are usually estimated using switching algorithms. These algorithms alternate between two sets of variables but can be slow to converge. Acceleration methods are proposed that combine simplicity and effectiveness. These methods also outperform existing proposals in some applications of the expectation-maximization method and parallel factor analysis.
Although the item count technique is useful in surveys with sensitive questions, the privacy of respondents who possess the sensitive characteristic of interest may not be well protected because of a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and the negative binomial item count technique) which replace the several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide a closed-form variance estimate and a confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.
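To make the idea of a single Poisson count masking a sensitive answer concrete, the following is a minimal simulation sketch. It assumes a two-group design that the abstract does not spell out: control respondents report a Poisson count X, and treatment respondents report X + Z, where Z indicates the sensitive characteristic. The function names, the parameter values, and the difference-in-means estimator are illustrative assumptions, not the estimator proposed in the article.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_poisson_ict(n, lam, pi, seed=0):
    """Simulate an assumed two-group Poisson item count design.

    Control respondents report X ~ Poisson(lam).
    Treatment respondents report X + Z with Z ~ Bernoulli(pi).
    Returns the difference in group means, a simple moment estimate of pi
    (it is not constrained to [0, 1]).
    """
    rng = random.Random(seed)
    control = [poisson_draw(lam, rng) for _ in range(n)]
    treatment = [
        poisson_draw(lam, rng) + (1 if rng.random() < pi else 0)
        for _ in range(n)
    ]
    return sum(treatment) / n - sum(control) / n


if __name__ == "__main__":
    print(simulate_poisson_ict(n=20000, lam=2.0, pi=0.3))
```

Because each treatment respondent reports only a sum, an individual answer does not reveal Z under this assumed design, while the group means still identify the sensitive proportion.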
Tremor activity has recently been detected in various tectonic areas worldwide and is spatially segmented and temporally recurrent. We design a type of hidden Markov model to investigate this phenomenon, where each state represents a distinct segment of tremor sources. A mixture distribution of a Bernoulli variable and a continuous variable is introduced into the hidden Markov model to handle the fact that tremor clusters are very sparse in time. We apply our model to tremor data from the Tokai region in Japan to identify distinct segments of tremor source regions, and the results reveal the spatiotemporal migration pattern among these segments.
Diagnostic tests are often compared in multi-reader multi-case (MRMC) studies in which a number of cases (subjects with or without the disease in question) are examined by several readers using all tests to be compared. One of the commonly used methods for analyzing MRMC data is the Obuchowski-Rockette (OR) method, which assumes that the true area under the receiver operating characteristic curve (AUC) for each combination of reader and test follows a linear mixed model with fixed effects for test and random effects for reader and the reader-test interaction. This article proposes generalized linear mixed models which generalize the OR model by incorporating a range-appropriate link function that constrains the true AUCs to the unit interval. The proposed models can be estimated by maximizing a pseudo-likelihood based on the approximate normality of AUC estimates. A Monte Carlo expectation-maximization algorithm can be used to maximize the pseudo-likelihood, and a non-parametric bootstrap procedure can be used for inference. The proposed method is evaluated in a simulation study and applied to an MRMC study of breast cancer detection.
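The relationship between the OR model and its generalization can be written compactly as below. The notation is an assumption introduced here for illustration ($\theta_{ij}$ for the true AUC of test $i$ and reader $j$, $\tau_i$ for the fixed test effect, $R_j$ and $(\tau R)_{ij}$ for the random reader and reader-test effects), and the logit is shown only as one example of a range-appropriate link; the abstract does not say which links the article uses.

```latex
% OR model: the true AUC itself is linear in the effects
\theta_{ij} = \mu + \tau_i + R_j + (\tau R)_{ij}

% Generalized version: a link g maps (0, 1) to the real line
g(\theta_{ij}) = \mu + \tau_i + R_j + (\tau R)_{ij},
\qquad \text{e.g. } g(\theta) = \log\frac{\theta}{1 - \theta}

% so that the implied AUC always lies in the unit interval
\theta_{ij} = g^{-1}\bigl(\mu + \tau_i + R_j + (\tau R)_{ij}\bigr) \in (0, 1)
```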
In this article we propose a multiple-inflation Poisson regression to model count response data containing excessive frequencies at more than one non-negative integer value. To handle multiple excessive count responses, we generalize the zero-inflated Poisson regression by replacing its binary regression with a multinomial regression, whereas Su et al. [Statist. Sinica 23 (2013) 1071-1090] proposed a multiple-inflation Poisson model for consecutive count responses with excessive frequencies. We give several properties of our proposed model and carry out statistical inference under a fully Bayesian framework. We perform simulation studies and also use our methodology to analyze data on the number of infections collected in five major hospitals in Turkey.
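The way a multinomial component extends zero inflation can be sketched with a small probability mass function. This is an illustrative sketch under assumptions: the function names are hypothetical, the mixing weights are produced by a softmax over arbitrary scores (standing in for a multinomial regression), and no covariates or Bayesian inference are included.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass at the non-negative integer y."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def softmax(scores):
    """Turn real-valued scores into mixing weights that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiple_inflation_pmf(y, lam, inflated_values, scores):
    """Mixture of point masses at inflated_values and a Poisson(lam) part.

    scores has one entry per inflated value plus a final entry for the
    Poisson component; softmax(scores) gives the mixing weights.
    """
    weights = softmax(scores)
    point_mass = sum(
        w for v, w in zip(inflated_values, weights[:-1]) if v == y
    )
    return point_mass + weights[-1] * poisson_pmf(y, lam)


if __name__ == "__main__":
    # Extra mass at 0 and 3 on top of a Poisson(2) component.
    for y in range(6):
        print(y, multiple_inflation_pmf(y, 2.0, [0, 3], [0.0, -0.5, 1.0]))
```

With a single inflated value at 0, the function reduces to the usual zero-inflated Poisson mass function, which is the special case the abstract generalizes.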
Background: High-throughput assays are widely used in biological research to select potential targets. A single high-throughput experiment can efficiently study a large number of candidates simultaneously, but is subject to substantial variability. It is therefore scientifically important to perform quantitative reproducibility analysis to identify reproducible targets with consistent and significant signals across replicate experiments. A few methods exist, but all have limitations. Methods: In this paper, we propose a new method for identifying reproducible targets. Considering a Bayesian hierarchical model, we show that the test statistics from replicate experiments follow a mixture of multivariate Gaussian distributions, with the zero-mean component representing the irreproducible targets. Results: A target is thus classified as reproducible or irreproducible based on its posterior probability of belonging to the reproducible components. We study the performance of our proposed method using simulations and a real data example. Conclusion: The proposed method is shown to have favorable performance in identifying reproducible targets compared to other methods.
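The classification step described above amounts to computing posterior component probabilities in a Gaussian mixture. The sketch below illustrates that computation only; it assumes diagonal covariances and already-known mixture parameters for brevity, and all names and numbers are hypothetical rather than taken from the article.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with independent coordinates."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(
            2.0 * math.pi * vi
        )
    return density


def posterior_probs(x, weights, means, variances):
    """Posterior probability of each mixture component given statistic x."""
    joint = [
        w * diag_gaussian_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    # Component 0 is the zero-mean (irreproducible) component.
    weights = [0.5, 0.5]
    means = [(0.0, 0.0), (3.0, 3.0)]
    variances = [(1.0, 1.0), (1.0, 1.0)]
    post = posterior_probs((2.8, 3.1), weights, means, variances)
    print("posterior of reproducible component:", post[1])
```

A target would then be labeled reproducible when the summed posterior probability of the non-zero-mean components exceeds a chosen threshold; the threshold itself is a design decision not specified in the abstract.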
Deciding the number of clusters k is one of the most difficult problems in cluster analysis. For this purpose, complexity-penalized likelihood approaches have been introduced in model-based clustering, such as the well-known Bayesian information criterion and integrated complete likelihood criteria. However, the classification/mixture likelihoods considered in these approaches are unbounded without any constraint on the cluster scatter matrices. Constraints also prevent traditional EM and CEM algorithms from being trapped in (spurious) local maxima. Controlling the maximal ratio between the eigenvalues of the scatter matrices to be smaller than a fixed constant c >= 1 is a sensible idea for setting such constraints. A new penalized likelihood criterion is proposed which takes into account the higher model complexity that a higher value of c entails. Based on this criterion, a novel and fully automated procedure is provided that leads to a small ranked list of optimal (k, c) couples. A new plot called "car-bike," which provides a concise summary of the solutions, is introduced. The performance of the procedure is assessed both in empirical examples and through a simulation study as a function of cluster overlap. Supplementary materials for the article are available online.
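The eigenvalue-ratio constraint can be illustrated with a short check. This sketch assumes the ratio is taken between the largest and smallest eigenvalue pooled over all cluster scatter matrices, and it handles only 2x2 symmetric matrices so that the eigenvalues have a closed form; the function names are hypothetical.

```python
import math


def eigenvalues_sym2(matrix):
    """Eigenvalues of a 2x2 symmetric matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = matrix
    center = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return center - radius, center + radius


def max_eigen_ratio(scatters):
    """Largest eigenvalue over smallest, pooled across all clusters."""
    eigs = [e for s in scatters for e in eigenvalues_sym2(s)]
    return max(eigs) / min(eigs)


def satisfies_constraint(scatters, c):
    """True when the pooled eigenvalue ratio does not exceed c >= 1."""
    return max_eigen_ratio(scatters) <= c


if __name__ == "__main__":
    scatters = [
        [[2.0, 0.0], [0.0, 1.0]],  # elongated cluster
        [[4.0, 0.0], [0.0, 4.0]],  # larger spherical cluster
    ]
    print(max_eigen_ratio(scatters))
    print(satisfies_constraint(scatters, c=10.0))
```

With c = 1 all clusters are forced to be spherical with equal scatter, while larger values of c allow more heterogeneous shapes, which is why the criterion in the abstract penalizes larger c.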
Non-negative matrix factorization (NMF) is a multivariate analysis technique that approximates a given matrix of non-negative data by the product of two non-negative factor matrices, and it has been applied in a number of fields. However, when a matrix containing non-negative data has many zeroes, NMF encounters an approximation difficulty. This zero-inflated situation occurs often when a data matrix is given as count data, and becomes more challenging with matrices of increasing size. To solve this problem, we propose a new NMF model for zero-inflated non-negative matrices. Our model is based on the zero-inflated Tweedie distribution. The Tweedie distribution is a generalization of the normal, the Poisson, and the gamma distributions, and differs from each of the other distributions in the degree of robustness of its estimated parameters. In this paper, we show through numerical examples that the proposed model is superior to the basic NMF model in terms of approximation of zero-inflated data. Furthermore, we show the differences between the estimated basis vectors found using the basic and the proposed NMF models for divergence by applying them to real purchasing data.