The estimation of parameters is a key component in statistical modelling and inference. However, the parametrization of certain likelihood functions can lead to highly correlated estimates, causing numerical problems, mathematical complexity, and difficulty in estimation, or erroneous interpretation and, subsequently, erroneous inference. In statistical estimation, orthogonalization is a familiar simplifying technique that allows parameters to be estimated independently and thus separates the information about each parameter from that about the others. We introduce a Fisher scoring iterative process that incorporates the Gram-Schmidt orthogonalization technique for maximum likelihood estimation. A finite mixture model for correlated binary data is used to illustrate the implementation of the method, with discussion of an application to oesophageal cancer data.
To model count data with excess zeros, ones and twos, we introduce for the first time a zero-one-two-inflated Poisson (ZOTIP) distribution, which includes the zero-inflated Poisson (ZIP) and the zero-and-one-inflated Poisson (ZOIP) distributions as two special cases. We establish three equivalent stochastic representations for the ZOTIP random variable and use them to develop important distributional properties of the ZOTIP distribution. The Fisher scoring and expectation-maximization (EM) algorithms are derived to obtain the maximum likelihood estimates of the parameters of interest. Bootstrap confidence intervals are also provided. Hypothesis testing is considered, simulation studies are conducted, and two real data sets are used to illustrate the proposed methods.
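As a rough illustration of the kind of distribution this abstract describes, the sketch below writes a zero-one-two-inflated Poisson probability mass function as a four-component mixture: point masses at 0, 1 and 2 plus a Poisson component. The parametrization (`phi0`, `phi1`, `phi2`, `lam`) and the function name `zotip_pmf` are assumptions made for illustration and may differ from the paper's notation.

```python
import math


def zotip_pmf(k, phi0, phi1, phi2, lam):
    """Mixture-form ZOTIP pmf (illustrative parametrization, not the paper's).

    phi0, phi1, phi2: extra probability mass placed at 0, 1 and 2.
    lam: rate of the Poisson component, which gets the remaining weight.
    """
    phi3 = 1.0 - phi0 - phi1 - phi2  # weight of the Poisson component
    if min(phi0, phi1, phi2, phi3) < 0.0:
        raise ValueError("mixing weights must be non-negative and sum to at most 1")
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    point_mass = {0: phi0, 1: phi1, 2: phi2}.get(k, 0.0)
    return point_mass + phi3 * poisson


# Setting phi1 = phi2 = 0 gives the ZIP special case; phi2 = 0 alone gives ZOIP.
zip_at_zero = zotip_pmf(0, 0.3, 0.0, 0.0, 1.5)
```

Under this assumed form, the special cases named in the abstract fall out by setting inflation weights to zero, which is one way to sanity-check any fitting code built on top of it.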
In the current estimation of a GLM, the correlation structure of regressors is not used as the basis on which to lean strong predictive dimensions. Looking for linear combinations of regressors that merely maximize the likelihood of the GLM has two major consequences: (1) collinearity of regressors is a source of estimation instability, and (2) as predictive dimensions may lean on noise, both the predictive and the explanatory power of the model are jeopardized. For a single dependent variable, attempts have been made to adapt PLS regression, which solves this problem in the classical linear model, to GLM estimation. In this paper, we first discuss the methods thus developed, and then propose a technique, Supervised Component Generalized Linear Regression (SCGLR), that combines PLS regression with GLM estimation in the multivariate context. SCGLR is tested on both simulated and real data. (C) 2013 Elsevier Inc. All rights reserved.
In this paper, a new flexible count regression analysis is proposed. For this purpose, a new modification of the Poisson distribution is introduced which generalizes the Poisson, zero-inflated Poisson, zero-one-inflated Poisson, and zero-one-two-inflated Poisson distributions. Some distributional properties of the proposed distribution are discussed. The Fisher scoring and EM algorithms are derived to obtain the maximum likelihood estimates of the unknown parameters. The expected Fisher information matrix is provided to construct approximate confidence intervals for the parameters. Using the modified Poisson distribution, an arbitrary multiply inflated count regression model is proposed. The performance of the maximum likelihood methodology is investigated in a simulation study for both the distribution and the count regression model. Finally, two practical data sets are analyzed and the superiority of the proposed model over competing models is demonstrated.
Discriminative models have been shown to be more advantageous for pattern recognition problems in machine learning. The main focus of this study is the development of a new hybrid model that combines the advantages of a discriminative technique, namely the support vector machine (SVM), with the full efficiency offered through covariance multivariate generalized Gaussian mixture models (MGGMM). This new hybrid MGGMM applies the Fisher and Kullback-Leibler kernels derived from the MGGMM to improve the kernel function of the SVM. The approach is based on two different learning techniques: the Fisher scoring algorithm and a Bayesian inference technique based on Markov chain Monte Carlo and the Metropolis-Hastings algorithm. These learning methods work with two model selection approaches (minimum message length and marginal likelihood) to determine the number of clusters. The effectiveness of the framework is demonstrated through extensive experiments on synthetic datasets, facial expression recognition and human activity recognition.
This paper presents estimation procedures for a bivariate cointegration model when the errors are generated by a constant conditional correlation model. In particular, the method of maximum likelihood is discussed for the case where the errors follow Generalised Autoregressive Conditional Heteroskedastic (GARCH) models with Gaussian and some non-Gaussian innovations. The method of estimation is illustrated using simulated observations. A data analysis is provided to highlight applications of the proposed models.
A medical examination provides a key input into decisions about disability pension and other forms of income support or compensation that are justified on medical grounds. The result of examining an individual is often communicated by means of a score, and inflation of such scores is a well-known problem. We estimate the extent of inflation of scores from a set of disability assessments using a model based on the discrete linear distribution. We explore some extensions within the framework of a sensitivity analysis.
Measurement error in continuous, normally distributed data is well known in the literature. Measurement error in a binary outcome variable, however, remains under-studied. Misclassification is the error in categorical...
Geographically weighted generalised linear models extend geographically weighted regression models to handle response variables whose distributions belong to the exponential family. In view of the advantages of the local-linear fitting technique, we propose in this paper a local-linear likelihood estimation approach for geographically weighted generalised linear models to improve the accuracy of the coefficient estimators. A Fisher scoring algorithm is formulated to compute the estimators of the coefficients. Simulations are conducted for some typical geographically weighted generalised linear models to evaluate the performance of the proposed estimation method. The results show that, compared with the existing local-constant likelihood estimation, the local-linear likelihood method clearly improves the accuracy of the coefficient estimators. Finally, a real-world data set is analysed to demonstrate the application of the proposed approach.
ISBN: (print) 9781509051540
Many statistical models require the estimation of unknown (co)variance parameters. The estimates are usually obtained by maximizing a log-likelihood that involves log-determinant terms. In principle, the Newton method requires the observed information (the negative Hessian matrix, or negative second derivative, of the log-likelihood) to obtain an accurate maximum likelihood estimator. Using the Fisher information, the expected value of the observed information, in its place gives the Fisher scoring algorithm, which is simpler than the Newton method. With advances in high-throughput technologies in the biological sciences, recommendation systems and social networks, the sizes of data sets, and of the corresponding statistical models, have suddenly increased by several orders of magnitude. Neither the observed information nor the Fisher information is easy to obtain for these big data sets. This paper introduces an information splitting technique to simplify the computation. By splitting the mean of the observed information and the Fisher information, a simpler approximate Hessian matrix for the log-likelihood can be obtained. This approximate Hessian matrix can significantly reduce the computation and makes the linear mixed model applicable to big data sets. The splitting and the simpler formulas depend heavily on matrix algebra transforms, and are applicable to large-scale breeding models and genome-wide association analysis.
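The contrast this abstract draws between the Newton method and Fisher scoring can be seen in a toy one-parameter problem. The sketch below is not the paper's linear mixed model or its information splitting technique; it estimates the location of a Cauchy distribution with unit scale, an assumed example chosen because its expected (Fisher) information is the constant n/2 while its observed information changes with the data. The function names `cauchy_score` and `fisher_scoring_cauchy` are hypothetical.

```python
def cauchy_score(theta, data):
    """Derivative of the Cauchy(theta, scale=1) log-likelihood with respect to theta."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in data)


def fisher_scoring_cauchy(data, theta0, tol=1e-10, max_iter=100):
    """Fisher scoring for the Cauchy location parameter (toy illustration).

    Each update is theta <- theta + score / expected_information.
    Newton's method would divide by the observed information
    (the negative second derivative) instead.
    """
    expected_info = len(data) / 2.0  # Fisher information of n Cauchy observations
    theta = theta0
    for _ in range(max_iter):
        step = cauchy_score(theta, data) / expected_info
        theta += step
        if abs(step) < tol:
            break
    return theta


# Sample that is symmetric about 3, so the score is exactly zero at theta = 3.
sample = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0]
theta_hat = fisher_scoring_cauchy(sample, theta0=2.5)
```

Because the expected information here does not depend on the data, each iteration only needs the score, which is the kind of simplification that motivates replacing the observed information in larger models.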