In many chemical industries, a production line usually produces various products with different grades to meet the demands of the worldwide market. A process with multiple grades is not well described by a traditional single model. In this paper, a multi-grade principal component analysis (MGPCA) model is proposed for multi-grade process modeling and fault detection. The proposed MGPCA can use measurements from different grades with unequal sample sizes to extract the essential information from the multi-grade process. The model is derived in a probabilistic framework and the corresponding parameters are estimated by the expectation-maximization algorithm. Finally, a simulated case and a real industrial polyethylene process with multiple grades are tested to evaluate the performance of the proposed method.
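As a point of reference for the probabilistic framework mentioned above, the following is a minimal sketch of EM for ordinary single-grade probabilistic PCA (the Tipping and Bishop formulation). It is not the MGPCA model of the paper, whose multi-grade structure the abstract does not specify; the function name `ppca_em` and its arguments are illustrative assumptions.

```python
import numpy as np

def ppca_em(X, q, n_iter=200, seed=0):
    """EM for standard probabilistic PCA (NOT the paper's MGPCA).

    X is an (n, d) data matrix, q the assumed number of latent components.
    Returns the sample mean, the (d, q) loading matrix and the noise variance.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu                               # centred data
    W = rng.normal(size=(d, q))               # random initial loadings
    sigma2 = 1.0                              # initial noise variance
    for _ in range(n_iter):
        # E-step: posterior moments of the latent scores z_i
        M = W.T @ W + sigma2 * np.eye(q)
        M_inv = np.linalg.inv(M)
        Ez = Xc @ W @ M_inv                   # row i holds E[z_i]
        sum_Ezz = n * sigma2 * M_inv + Ez.T @ Ez   # sum_i E[z_i z_i^T]
        # M-step: update the loadings, then the noise variance
        W_new = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)
        sigma2 = (np.sum(Xc ** 2)
                  - 2.0 * np.sum(Ez * (Xc @ W_new))
                  + np.trace(sum_Ezz @ W_new.T @ W_new)) / (n * d)
        W = W_new
    return mu, W, sigma2
```

A fault-detection statistic would then be built from the fitted `W` and `sigma2`; how the paper does this for several grades is not stated in the abstract.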
This article studies the dependence of spatial linear models using a slash distribution with a finite second moment. The parameters of the model are estimated by maximum likelihood using the EM algorithm. To avoid identifiability problems, cross-validation, the trace and the maximum log-likelihood value are used to choose the parameter that adjusts the kurtosis of the slash distribution and to select the model that explains the spatial dependence. We present diagnostic techniques of global and local influence for exploring the sensitivity of the estimators and the presence of possible influential observations. A simulation study is developed to determine the performance of the methodology. The results showed the effectiveness of the criteria for choosing the kurtosis parameter and for selecting the spatial dependence model. They also showed that the slash distribution provides increased robustness to the presence of influential observations. As an illustration, the proposed model and its diagnostics are used to analyze an aquifer data set. The spatial predictions with and without the influential observations were compared. The results show that the contours of the interpolation maps and prediction standard error maps changed little when the influential observations were removed. Thus, this model is a robust alternative in spatial linear modeling for dependent random variables. Supplementary materials accompanying this paper appear online.
Nowadays, online product reviews play a crucial role in the purchase decisions of consumers. A high proportion of positive reviews brings substantial sales growth, while negative reviews cause sales losses. Driven by the immense financial profits, many spammers try to promote their products or demote their competitors' products by posting fake and biased online reviews. By registering a number of accounts or releasing tasks on crowdsourcing platforms, individual spammers can be organized into spammer groups that manipulate product reviews together and can be more damaging. Existing works on spammer group detection extract spammer group candidates from review data and identify the real spammer groups using unsupervised spamicity ranking methods. According to previous research, labeling a small number of spammer groups is easier than one might assume; however, few methods make good use of these labeled data. In this paper, we propose a partially supervised learning model (PSGD) to detect spammer groups. By labeling some spammer groups as positive instances, PSGD applies positive unlabeled learning (PU-Learning) to learn a classifier as a spammer group detector from positive instances (labeled spammer groups) and unlabeled instances (unlabeled groups). Specifically, we extract a reliable negative set in terms of the positive instances and the distinctive features. By combining the positive instances, extracted negative instances and unlabeled instances, we convert the PU-Learning problem into the well-known semi-supervised learning problem, and then use a Naive Bayesian model and an EM algorithm to train a classifier for spammer group detection. Experiments on real-life *** data set show that the proposed PSGD is effective and outperforms the state-of-the-art spammer group detection methods.
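The final training step described above (positive, reliable-negative and unlabeled instances combined through naive Bayes and EM) can be sketched roughly as follows. This is not the PSGD implementation: a Gaussian naive Bayes model is assumed because the abstract does not state the feature types, the reliable negative set is taken as given rather than extracted, and the name `nb_em_pu` is hypothetical.

```python
import numpy as np

def nb_em_pu(X_pos, X_neg, X_unl, n_iter=20):
    """Semi-supervised Gaussian naive Bayes trained with EM (illustrative only).

    X_pos: labeled positive rows, X_neg: reliable negative rows (assumed given
    and non-empty), X_unl: unlabeled rows. Returns a function mapping feature
    rows to the estimated probability of the positive class.
    """
    X = np.vstack([X_pos, X_neg, X_unl])
    n_lab = len(X_pos) + len(X_neg)
    # r[i] is the current P(positive) for row i; labeled rows never change
    r = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_neg)),
                        np.full(len(X_unl), 0.5)])

    def fit(data, resp):
        # Weighted prior, per-feature mean and variance for class 0, then class 1
        params = []
        for wgt in (1.0 - resp, resp):
            tot = wgt.sum()
            mean = wgt @ data / tot
            var = wgt @ (data - mean) ** 2 / tot + 1e-6   # small floor for stability
            params.append((tot / len(data), mean, var))
        return params

    def posterior(params, Z):
        # Class log-joint under the naive (feature-independence) assumption
        logp = np.stack([
            np.log(prior) - 0.5 * np.sum(np.log(2 * np.pi * var)
                                         + (Z - mean) ** 2 / var, axis=1)
            for prior, mean, var in params], axis=1)
        logp -= logp.max(axis=1, keepdims=True)
        p = np.exp(logp)
        return p[:, 1] / p.sum(axis=1)

    params = fit(X[:n_lab], r[:n_lab])          # initial classifier: labeled rows only
    for _ in range(n_iter):
        r[n_lab:] = posterior(params, X_unl)    # E-step on the unlabeled rows
        params = fit(X, r)                      # M-step on all rows
    return lambda Z: posterior(params, np.asarray(Z, dtype=float))
```

Keeping the labeled responsibilities fixed at 0 or 1 is what makes this semi-supervised rather than plain clustering.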
Finite mixture models have provided a reasonable tool to model various types of observed phenomena, especially those which are random in nature. In this article, a finite mixture of Weibull and Pareto (IV) distributions is considered and studied. Some structural properties of the resulting model are discussed, including estimation of the model parameters via the expectation-maximization (EM) algorithm. A real-life data application shows that in certain situations this mixture model might be a better alternative than the popular rival models.
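For orientation, one EM iteration for a generic two-component mixture can be written as below. The notation ($\pi$, $f_W$, $f_P$, $\theta_W$, $\theta_P$, $\tau_i$) is assumed here and not taken from the article, and the component updates are left abstract because they generally require numerical maximization for Weibull and Pareto (IV) components.

```latex
% Mixture density with mixing weight \pi (notation assumed, not the article's)
f(x;\pi,\theta_W,\theta_P) = \pi\, f_W(x;\theta_W) + (1-\pi)\, f_P(x;\theta_P),
\qquad 0 < \pi < 1 .

% E-step: posterior probability that x_i came from the Weibull component
\tau_i^{(t)} =
\frac{\pi^{(t)} f_W\!\left(x_i;\theta_W^{(t)}\right)}
     {\pi^{(t)} f_W\!\left(x_i;\theta_W^{(t)}\right)
      + \left(1-\pi^{(t)}\right) f_P\!\left(x_i;\theta_P^{(t)}\right)} .

% M-step: closed form for the weight, weighted likelihoods for the components
\pi^{(t+1)} = \frac{1}{n}\sum_{i=1}^{n} \tau_i^{(t)}, \qquad
\theta_W^{(t+1)} = \arg\max_{\theta_W} \sum_{i=1}^{n} \tau_i^{(t)} \log f_W(x_i;\theta_W), \qquad
\theta_P^{(t+1)} = \arg\max_{\theta_P} \sum_{i=1}^{n} \left(1-\tau_i^{(t)}\right) \log f_P(x_i;\theta_P) .
```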
We introduce the best unbiased prediction of missing order statistics of a stable distribution, based on the conditional expected value. We present necessary and sufficient conditions for the existence of conditional moments of stable order statistics. These conditions enable us to compute unknown parameters using the expectation-maximization algorithm. We demonstrate the efficiency of the presented method through a simulation study.
In this letter, we exploit the data redundancy associated with alternate-relaying cooperative systems to develop an iterative channel estimation algorithm in the context of orthogonal frequency division multiplexing (OFDM) transmission. We also address the problem of in-phase/quadrature-phase (IQ) imbalance, which is typically associated with OFDM transmission. Our analysis indicates that, instead of estimating a family of parameters including the IQ imbalance occurring at the source, relays, and destination, and the channel impulse responses (CIRs) of the source-destination and relay-destination links, we can estimate a single parameter called the equivalent CIR. In addition, we illustrate how to perform data detection using the estimated parameter. By employing the expectation-maximization algorithm, we show that soft information provided by the detector can be combined with pilot symbols in an efficient way to enhance the estimation process. Simulation experiments have confirmed the efficiency of the proposed approach.
Gupta and Kundu (Statistics 43 (2009) 621-643) introduced a new class of weighted exponential distributions and established several of its properties. The probability density function of the proposed weighted exponential distribution is unimodal and it has an increasing hazard function. Along the same lines, Shahbaz, Shahbaz and Butt (Pak. J. Stat. Oper. Res. VI (2010) 53-59) introduced the weighted Weibull distribution, and we derive several new properties of this weighted Weibull distribution. The main aim of this paper is to introduce bivariate and multivariate distributions with weighted Weibull marginals and establish several of their properties. It is shown that the hazard function of the weighted Weibull distribution can have increasing, decreasing and inverted bathtub shapes. The proposed multivariate model is obtained as a hidden truncation model, in the same way as the univariate weighted Weibull model. To compute the maximum likelihood estimators of the unknown parameters of the proposed p-variate distribution, one needs to solve (p + 2) non-linear equations. We propose to use the EM algorithm to compute the maximum likelihood estimators of the unknown parameters. We obtain the observed Fisher information matrix, which can be used for constructing asymptotic confidence intervals. One data analysis has been performed for illustrative purposes, and it is observed that the proposed EM algorithm is very easy to implement and that its performance is quite satisfactory.
The simplex distribution has proved useful for directly modelling double-bounded variables. Yet, it is not sufficient for multimodal distributions. This article addresses the problem of estimating a density when the data are restricted to the (0,1) interval and contain several modes. In particular, we propose a simplex mixture model approach to model this kind of data. In order to estimate the parameters of the model, an Expectation Maximization (EM) algorithm is developed. The parameter estimation performance is evaluated through simulation studies. Models are explored using two real datasets: i) gene expression data of patients' survival times and the relation to adenocarcinoma and ii) magnetic resonance images (MRI) with a view to segmentation. In the latter case, given that the data contain zeros, the main model is modified to consider the zero-inflated setting.
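To make the model class concrete, here is a small sketch of the simplex density in its usual (Barndorff-Nielsen and Jørgensen) parameterization and of a finite mixture built from it. The parameterization used in the article is assumed, not confirmed by the abstract, and the function names are illustrative; the EM fit and the zero-inflated variant are not shown.

```python
import numpy as np

def simplex_pdf(y, mu, sigma2):
    """Simplex density on (0, 1) with mean mu in (0, 1) and dispersion sigma2 > 0."""
    y = np.asarray(y, dtype=float)
    # Unit deviance d(y; mu) of the simplex distribution
    d = (y - mu) ** 2 / (y * (1.0 - y) * mu ** 2 * (1.0 - mu) ** 2)
    return np.exp(-d / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2 * (y * (1.0 - y)) ** 3)

def simplex_mixture_pdf(y, weights, mus, sigma2s):
    """Finite mixture of simplex densities; weights are assumed to sum to one."""
    y = np.asarray(y, dtype=float)
    return sum(w * simplex_pdf(y, m, s2) for w, m, s2 in zip(weights, mus, sigma2s))
```

With two components centred at, say, 0.2 and 0.7, the mixture density is bimodal on (0,1), which a single simplex component with moderate dispersion cannot reproduce.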
Replicated data with measurement errors frequently arise in economic, environmental, chemical, medical and other fields. In this paper, we discuss a replicated measurement error model under the class of scale mixtures of skew-normal distributions, which extends symmetric heavy- and light-tailed distributions to asymmetric cases. We also consider equation error in the model to capture the degree of matching between the true covariate and the response. Explicit iterative expressions for the maximum likelihood estimates are provided via an expectation-maximization type algorithm. Empirical Bayes estimates are used to predict the true covariate and response. We study the effectiveness as well as the robustness of the maximum likelihood estimates through two simulation studies. The method is applied to analyze data from a continuing survey of food intakes by individuals on diet habits.
The short-term distribution of wave periods is very important for ocean and coastal engineering applications. At present, the vast majority of research studies are confined to single-wave systems. Most available theoretical distributions of wave periods, which are based on the narrowband approximation, are inapplicable to actual sea states. This study focuses on the probability distribution of individual wave periods in combined sea states with two parametric mixture distribution models. The expectation-maximisation (EM) algorithm is used to calculate the maximum likelihood estimates of the mixture models. Further, the mixture distributions are compared with two other models: a theoretical model and a parametric model. In situ-measured data with two-peaked spectra and simulated data obtained with the six-parameter Ochi-Hubble model allow for a thorough assessment of the distribution models. The patterns of the distributions of wave periods in nine types of mixed sea states are considered and discussed. According to the results, the theoretical distribution model is unsuitable for describing the distributions in mixed sea states, in particular when the patterns exhibit a bimodal character. By contrast, despite their higher computational complexity, the mixture distribution models provide improved performance for all combined sea state cases.
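The EM fitting of a bimodal period distribution can be illustrated with a two-component Gaussian mixture, chosen here only because its M-step has a closed form. The component families used in the study are not named in the abstract, so this is a stand-in, and `em_two_gaussians` is a hypothetical name.

```python
import numpy as np

def em_two_gaussians(x, n_iter=200):
    """EM for a 1-D two-component Gaussian mixture (stand-in component family).

    Returns mixing weights, component means and standard deviations.
    """
    x = np.asarray(x, dtype=float)
    # Crude initialisation: split the sample at its median
    med = np.median(x)
    mu = np.array([x[x <= med].mean(), x[x > med].mean()])
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted updates of weights, means and standard deviations
        nk = r.sum(axis=0)
        w = nk / x.size
        mu = (r * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sd
```

For strictly positive, skewed quantities such as wave periods, the same loop could be run on log-transformed data (a lognormal mixture), which is again an assumption rather than the study's choice.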