We propose a method for estimating parameters in binomial regression models when the response variable is missing and the missing data mechanism is nonignorable. We assume throughout that the covariates are fully observed. Using a logit model for the missing data mechanism, we show how parameter estimation can be accomplished using the EM algorithm by the method of weights proposed in Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). An example from the Six Cities Study (Ware et al., 1984, American Review of Respiratory Diseases 129, 366-374) is presented to illustrate the method.
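As a rough illustration of the E-step in a method-of-weights scheme, the sketch below computes, for one subject with a missing binary response, the conditional probabilities of y = 0 and y = 1 given the covariates and the fact that the response is missing. It is a minimal sketch, not the authors' implementation: the function names, the coefficient layout (the last entry of `alpha` multiplying y), and the coding of the missingness indicator as r = 1 for "missing" are all assumptions made for this example.

```python
import math


def expit(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def e_step_weights(x, beta, alpha):
    """Weights for the two possible values of a missing binary response.

    x     : list of covariates, with a leading 1.0 for the intercept
    beta  : coefficients of the response model P(y = 1 | x)
    alpha : coefficients of the logit missingness model P(r = 1 | x, y);
            the last entry multiplies y (hypothetical parameterization)
    """
    p1 = expit(sum(b * v for b, v in zip(beta, x)))
    joint = []
    for y, p_y in ((0, 1.0 - p1), (1, p1)):
        eta = sum(a * v for a, v in zip(alpha, x + [y]))
        # P(y | x) * P(r = 1 | x, y): joint probability of y and "missing"
        joint.append(p_y * expit(eta))
    total = joint[0] + joint[1]
    return joint[0] / total, joint[1] / total


# Example: when the y-coefficient in alpha is 0 the mechanism is ignorable,
# so the weights reduce to the response model probabilities.
w0, w1 = e_step_weights([1.0, 0.5], [0.2, -0.4], [-1.0, 0.3, 0.0])
```

In an M-step built on these weights, each subject with a missing response would contribute two weighted pseudo-observations (y = 0 and y = 1) to the response and missingness likelihoods, while subjects with an observed response keep weight 1.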
Influence diagnostic methods are extended in this article to the Grubbs model when the unknown quantity x (latent variable) follows a skew-normal distribution. Diagnostic measures are derived from the case-deletion approach and the local influence approach under several perturbation schemes. The observed information matrix for the postulated model and the Delta matrices for the corresponding perturbed models are derived. Results obtained for one real data set are reported, illustrating the usefulness of the proposed methodology.
The pace of development and adoption of new technologies for big data analytics has changed dramatically over the last several decades, and the amount of data being digitally ingested and stored is expanding exponentially. These data include structured, semi-structured and unstructured data, and come in different sizes and formats. To utilize these vast resources, the knowledge and skills needed to manage the data and convert them into information are crucial. In this paper, firstly, the commonly used technologies, platforms, computational tools and techniques currently in use for ingesting, processing, storing and analyzing big data are reviewed. Secondly, those technologies are utilized to predict internet congestion by employing the bivariate mixture transition distribution (BMTD), the expectation-maximization (EM) algorithm and autoregressive integrated moving average (ARIMA) models. BMTD models are very effective in capturing non-Gaussian and nonlinear features, such as bursts of activity and outliers, in a single unified model class. These models assume neither equally spaced observations nor independence, assumptions that are key weaknesses of some other available time series and marked point process models. Both the Weibull BMTD and the ARIMA models are very effective time series predictive models, but a comparison of their predictive performance has not yet been addressed in the statistics and machine learning literature.
An overview of a statistical paradigm for speech recognition is given where phonetic and phonological knowledge sources, drawn from the current understanding of the global characteristics of human speech communication, are seamlessly integrated into the structure of a stochastic model of speech. A consistent statistical formalism is presented in which the submodels for the discrete, feature-based phonological process and the continuous, dynamic phonetic process in human speech production are computationally interfaced. This interface enables global optimization of a parsimonious set of model parameters that accurately characterize the symbolic, dynamic, and static components in speech production and explicitly separates distinct sources of the speech variability observable at the acoustic level. The formalism is founded on a rigorous mathematical basis, encompassing computational phonology, Bayesian analysis and statistical estimation theory, nonstationary time series and dynamic system theory, and nonlinear function approximation (neural network) theory. Two principal ways of implementing the speech model and recognizer are presented, one based on the trended hidden Markov model (HMM) or explicitly defined trajectory model, and the other on the state-space or recursively defined trajectory model. Both implementations build into their respective recognition and model-training algorithms a continuity constraint on the internal, production-affiliated trajectories across feature-defined phonological units. The continuity and the parameterized structure in the dynamic speech model permit a joint characterization of the contextual and speaking-style variations manifested in speech acoustics, thereby holding promise to overcome some key limitations of current speech recognition technology. (C) 1998 Elsevier Science B.V. All rights reserved.
In this work, we extend standard likelihood-based procedures to the multivariate linear model using the scale mixtures of multivariate skew-normal-Cauchy distributions. A simple EM algorithm for iteratively computing maximum likelihood estimates is derived. The observed information matrix is computed analytically to account for standard errors. Some results are obtained from real and simulated datasets to illustrate the usefulness of the proposed model.
A problem that frequently occurs in biological experiments with laboratory animals is that some subjects are less susceptible to the treatment than others. Finite mixture models have traditionally been used to describe the distribution of responses in treated subjects for such studies. In this paper, we first study the mixture normal model with multiple levels and multiple mixture sub-populations under each level, with particular attention given to the model in which the proportions of susceptibility are related to dose levels. We then use the EM algorithm to find the maximum likelihood estimators of the model parameters. Our results generalize existing results. Finally, we illustrate the practical significance of this extension using a set of real dose-response data.
In this paper, by proposing a two-stage segmentation method based on an active contour model, we improve the procedure of former image segmentation methods. The first stage of our method computes the weights, means and variances of the image using a mixture of Gaussian distributions whose parameters are obtained from the EM algorithm. Once they are obtained, in the second stage, the segmentation is achieved by incorporating a level set method for minimizing the energy function. We use an adaptive direction function to make the curve evolution robust against the curve's initial position, a nonlinear adaptive velocity to speed up the process of curve evolution, and a probability-weighted edge and region indicator function to implement a robust segmentation for objects with weak boundaries. The method consists of minimizing a functional containing a penalty term, in an attempt to maintain the signed distance property in the entire domain, and an external energy term that achieves a minimum when the zero level set of the function is located at the desired position. (C) 2016 Elsevier Inc. All rights reserved.
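The first stage described above, estimating mixture weights, means and variances with EM, can be sketched for one-dimensional data such as pixel intensities. This is a minimal sketch under stated assumptions, not the paper's procedure: the function name `fit_gmm_1d`, the quantile-based initialization, the fixed iteration count and the variance floor are all choices made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k=2, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances). Initialization at evenly spaced
    quantiles and the fixed number of iterations are illustrative choices.
    """
    n = len(data)
    ordered = sorted(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)

    weights = [1.0 / k] * k
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [overall_var] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: weighted updates of weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor keeps the density finite

    return weights, means, variances
```

For an image, `data` would be the flattened list of pixel intensities; the fitted parameters could then feed a later stage such as the level set evolution, which is not shown here.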
A new class of survival frailty models based on the generalized inverse-Gaussian (GIG) distributions is proposed. We show that the GIG frailty models are flexible and mathematically convenient like the popular gamma frailty model. A piecewise-exponential baseline hazard function is employed, yielding flexibility for the proposed class. Although a closed-form observed log-likelihood function is available, simulation studies show that employing an EM algorithm is advantageous compared with direct maximization of this function. Further simulated results address the comparison of different methods for obtaining standard errors of the estimates and confidence intervals for the parameters. Additionally, the finite-sample behavior of the EM estimators is investigated and the performance of the GIG models under misspecification is assessed. We apply our methodology to a TARGET (Therapeutically Applicable Research to Generate Effective Treatments) dataset on the survival time of patients with neuroblastoma and show some advantages of the GIG frailties over existing models in the literature.
In this work, we define a new family of skew distributions: the Skew-Reflected-Gompertz. We also derive some of its probabilistic and inferential properties. The maximum likelihood estimates of the proposed distribution's parameters are obtained via an EM algorithm, and the performance of the proposed model and its estimates is shown via simulation studies as well as real applications. Three real datasets are used to illustrate the model's performance, which can compete against some well-known skew distributions frequently used in applications. (C) 2018 Elsevier B.V. All rights reserved.
This paper proposes a generalization of the Vector Taylor Series (VTS) approach for the compensation of speech feature distortions. It uses a phase-term-aware representation of the speech distortion model. It treats this term as a Gaussian random vector with unknown parameters, in the same manner as is conventionally done for additive noise. These parameters are estimated by means of the EM algorithm. Explicit expressions for the parameter updates are derived. The minimum mean square error (MMSE) estimate of clean speech features is also obtained. Experiments carried out on the Aurora2 and Aurora4 databases show that the proposed approach significantly outperforms the phase-insensitive version of feature-space VTS for both GMM and DNN acoustic models. It is also shown that combining the proposed approach with cepstral mean normalization (CMN) provides additional accuracy gains. (C) 2017 Elsevier B.V. All rights reserved.