ISBN: 9789881925169 (print)
This paper investigates the problem of parameter estimation in statistical models when the observations are intervals assumed to be related to underlying crisp realizations of a random sample. The proposed approach relies on extending the likelihood function to the interval setting. A maximum likelihood estimate (MLE) of the parameter of interest can then be defined as a crisp value maximizing the generalized likelihood function. Using the expectation-maximization (EM) algorithm to solve this maximization problem yields the so-called interval-valued EM algorithm (IEM), which makes it possible to solve a wide range of statistical problems involving interval-valued data. As an illustration, the IEM is used to estimate the mean and variance of a univariate normal distribution from interval-valued samples.
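As a rough illustration of the normal-distribution example, the sketch below runs EM on interval-valued observations. It assumes (the abstract does not state this) that the generalized likelihood of an interval [a, b] is the probability that the latent crisp value falls inside it, so the E-step reduces to truncated-normal moments. The name `interval_em_normal` and its parameters are hypothetical and are not taken from the paper.

```python
import math

def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_em_normal(intervals, n_iter=500, tol=1e-10):
    """Estimate (mean, variance) of a normal law from intervals (a, b), a <= b.

    Assumption: each interval contains one latent crisp draw.
    """
    n = len(intervals)
    # Initialise from the interval midpoints.
    mids = [(a + b) / 2.0 for a, b in intervals]
    mu = sum(mids) / n
    var = max(sum((m - mu) ** 2 for m in mids) / n, 1e-6)
    for _ in range(n_iter):
        sigma = math.sqrt(var)
        sum_x = sum_x2 = 0.0
        # E-step: conditional first and second moments of each latent value.
        for a, b in intervals:
            al, be = (a - mu) / sigma, (b - mu) / sigma
            mass = _cdf(be) - _cdf(al)
            if mass < 1e-12:
                # Degenerate or far-tail interval: treat the midpoint as crisp.
                m1, v = (a + b) / 2.0, 0.0
            else:
                r = (_pdf(al) - _pdf(be)) / mass
                m1 = mu + sigma * r
                v = var * (1.0 + (al * _pdf(al) - be * _pdf(be)) / mass - r * r)
                v = max(v, 0.0)
            sum_x += m1
            sum_x2 += v + m1 * m1
        # M-step: closed-form normal updates from the expected moments.
        new_mu = sum_x / n
        new_var = max(sum_x2 / n - new_mu ** 2, 1e-12)
        done = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if done:
            break
    return mu, var
```

With zero-width intervals the routine falls back to the ordinary sample mean and (biased) sample variance, which is a convenient sanity check on the sketch.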
ISBN: 9781467311595 (print)
In this paper, an unsupervised change detection method based on spatial contextual information and the EM algorithm is proposed. In the algorithm, each pixel of the difference image is represented by a characteristic quantity constructed from the difference image values, taking the spatial context into account. The EM algorithm is used to estimate the parameters of each pixel class. Bayesian inference is then employed to produce the final change detection result. Experimental results obtained on multi-temporal optical images acquired by Landsat 5 TM confirm the effectiveness of the proposed approach.
ISBN: 9783642352850 (print)
The segmentation of color images is an important research field in image processing and pattern recognition. A color image can be regarded as the outcome of a Gaussian mixture model (GMM) to which several Gaussian random variables contribute. In this paper, an efficient image segmentation method is proposed. The method models the original image with a Gaussian mixture and turns the segmentation problem into maximum likelihood parameter estimation, solved by the expectation-maximization (EM) algorithm. Classifying the pixels of the image with the fitted model resolves the color image segmentation problem to some extent. The experimental results confirm the validity of the method.
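The idea of fitting a mixture by EM and then labelling each pixel by its most probable component can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes scalar intensities rather than color vectors, a fixed number of components, and a simple range-based initialisation. The name `gmm_em_1d` is hypothetical.

```python
import math

def _responsibilities(x, weights, means, variances):
    """Posterior probability of each component for one value x."""
    dens = [
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(dens)
    k = len(dens)
    # If every density underflows, fall back to a uniform posterior.
    return [d / total for d in dens] if total > 0.0 else [1.0 / k] * k

def gmm_em_1d(values, k=2, n_iter=100):
    """Fit a k-component 1D Gaussian mixture by EM and label each value."""
    n = len(values)
    lo, hi = min(values), max(values)
    grand = sum(values) / n
    spread = max(sum((x - grand) ** 2 for x in values) / n, 1e-6)
    # Assumed initialisation: means spread over the range, shared variance.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    variances = [spread / (k * k)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibilities under the current parameters.
        resp = [_responsibilities(x, weights, means, variances) for x in values]
        # M-step: responsibility-weighted updates.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            sq = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values))
            variances[j] = max(sq / nj, 1e-6)
    # Segmentation: assign each value to its most probable component.
    labels = []
    for x in values:
        r = _responsibilities(x, weights, means, variances)
        labels.append(r.index(max(r)))
    return weights, means, variances, labels
```

For an actual color image, `values` would be replaced by per-pixel feature vectors and the scalar variances by covariance matrices; that extension is left out to keep the sketch short.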
ISBN: 9781467356664; 9781467356657 (print)
Automatic unsupervised clustering of white matter fiber tracts is necessary for the group analysis of brain neural network integrity using diffusion tensor imaging (DTI) and DTI tractography techniques. In this paper, we present an implementation of the expectation-maximization (EM) algorithm for the automatic unsupervised clustering of reconstructed white matter fiber tracts in DTI. The statistical model is a multivariate, multimodal Gaussian mixture that describes the probability distribution of white matter fiber tracts. Issues related to parameter estimation, initialization, and convergence when applying the EM algorithm to white matter fiber tract clustering are discussed in detail. The EM algorithm is also compared with the K-means approach. Experiments demonstrate how fixed and varying prior probabilities affect the clustering result as the EM algorithm proceeds. Real DTI datasets are used to evaluate the performance of the proposed method. Experimental results show that the proposed approach is feasible and may be useful in the automatic unsupervised clustering of white matter fiber tracts.
ISBN: 9783037853511 (print)
For Gaussian mixture learning, the expectation-maximization (EM) algorithm and its modified versions are widely used, but two major limitations remain: (i) the number of components (Gaussians) must be known in advance, and (ii) there is no generally accepted method of parameter initialization that prevents the algorithm from being trapped in a local maximum of the likelihood function. To overcome these weaknesses, we propose a greedy EM algorithm based on a kurtosis and skewness criterion. Specifically, we start with a single component and add one component at a time within the EM framework so as to decrease the value of the kurtosis and skewness measure, which provides an efficient index of how well the Gaussian mixture model fits the sample data. In this way, the number of components can be selected adaptively during EM learning, and the learned parameters have a chance of escaping from local maxima.
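The abstract does not give the exact form of the kurtosis and skewness measure, so the sketch below only computes the standard sample skewness and excess kurtosis, both of which are zero for a Gaussian distribution. How the paper combines them across mixture components is not reproduced here, and the name `skewness_kurtosis` is hypothetical.

```python
def skewness_kurtosis(sample):
    """Return (skewness, excess kurtosis) from central moments of the sample."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    if m2 == 0.0:
        raise ValueError("sample has zero variance")
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # subtract the Gaussian value of 3
    return skew, excess_kurt
```

Values far from zero for the data attributed to a component suggest that a single Gaussian fits that component poorly, which is the kind of signal a greedy scheme could use when deciding whether to add a component.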
ISBN: 9781467300469 (print)
This paper extends the expectation-maximization (EM) algorithm to estimate not only optimal acoustic model parameters, but also optimal center frequencies and bandwidths of the filter bank used in cepstral feature extraction for bird call classification. The search is performed by gradient ascent. Filter bank and model parameters are optimized iteratively. Experiments are conducted on a large noisy corpus containing Antbird calls from 5 species. Features extracted with the optimized filter bank are shown to yield a lower classification error rate than those extracted with a Mel-scaled filter bank.
A method based on the expectation-maximization (EM) algorithm and Gibbs sampling is proposed to estimate the parameters of Bayesian networks (BNs). We employ Gibbs sampling to approximate the E-step of the EM algorithm. Using the transition probabilities, Gibbs sampling completes the missing data in the E-step, which reduces the computational complexity of the EM algorithm. Experiments compare the proposed method with the standard EM algorithm. The proposed method requires less time and fewer iterations than the EM algorithm. However, its KL divergence is higher than that of the EM algorithm, which is a limitation of the proposed method. (C) 2017 The Authors. Published by Elsevier B.V.
In this paper we extend and improve a recently proposed method for measuring the cultural distance between strata. In the original version, strata meant countries: respondents from different countries were clustered on the basis of their answers to a set of questions, and their resulting distribution among the K clusters thus formed was used to calculate the "position" of each country as a point in a K-dimensional space. The proposed improvements are as follows. First, the notion of "strata" is enlarged to cover not only geographic units, but also gender, age, education, religious attitudes and rural/urban residence. Second, clustering is now based on EM (expectation-maximization), which automatically determines the optimal number of clusters, thus overcoming one of the major limitations of the previous version of the method. Third, since this optimal number of clusters turns out to be small, a principal component analysis is used to capture most of the variability and draw a very telling, two-dimensional representation of how (culturally) distant strata are from one another. Fourth, since two types of distances between strata can be computed, a cultural and an "objective" one (e.g., kilometers between regions or years between age groups), their correlation can be calculated. On our Istat (Indagine multiscopo, Aspetti della vita quotidiana, Rome, 2013) data, expectations are confirmed: the farther apart strata are, the greater their cultural distance. The same holds for the (rural/urban) type of commune of residence. Religion, instead, is rarely, and gender is never, associated with any measurable cultural difference.
Background: Unsupervised clustering is one of the most widely applied methods in the analysis of high-throughput omics data. A variety of unsupervised model-based (parametric) and non-parametric clustering methods have been proposed for RNA-seq count data, most of which perform well for large samples, e.g. N500. A common issue when analyzing limited samples of RNA-seq count data is that the data follow an over-dispersed distribution, and thus a Negative Binomial likelihood model is often used. Thus, we have developed a Negative Binomial model-based (NBMB) clustering approach for application to RNA-seq *** have developed a Negative Binomial Model-Based (NBMB) method to cluster samples using a stochastic version of the expectation-maximization algorithm. A simulation study covering various scenarios was completed to compare the performance of NBMB with Gaussian model-based clustering, i.e. Gaussian mixture modeling (GMM). NBMB was also applied to the clustering of two RNA-seq studies: a type 2 diabetes study (N=96) and a TCGA study of ovarian cancer (N=295). Simulation results showed that NBMB outperforms GMM applied with different transformations in the majority of scenarios with limited sample size. Additionally, we found that NBMB outperformed GMM for small distances between clusters regardless of sample size. Increasing the total number of genes while keeping the proportion of differentially expressed genes fixed does not remove the advantage of NBMB, but it improves the overall performance of GMM. Analysis of the type 2 diabetes and ovarian cancer tumor data with NBMB found good agreement with the reported disease subtypes and the gene expression patterns. This method is available in an R package on CRAN named *** of Negative Binomial model-based clustering is advisable when clustering over-dispersed RNA-seq count data.
Parameters of a finite mixture model are often estimated by the expectation-maximization (EM) algorithm, which maximizes the observed-data log-likelihood function. This paper proposes an alternative approach for fitting finite mixture models. Our method, called iterative Monte Carlo classification (IMCC), is also an iterative fitting procedure. Within each iteration, it first estimates the membership probabilities for each data point, namely the conditional probability that the data point belongs to a particular mixing component given its observed value. It then classifies each data point into a component distribution using the estimated conditional probabilities and the Monte Carlo method. Finally, it updates the parameters of each component distribution based on the classified data. Simulation studies were conducted to compare IMCC with some other algorithms for fitting mixtures of normal densities and mixtures of t densities.
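The three steps within each iteration (membership probabilities, Monte Carlo classification, per-component update) can be sketched for a univariate normal mixture as follows. This is only one reading of the abstract: the initialisation, the add-one smoothing of the weights, and the handling of near-empty components are assumptions, and the name `imcc_normal_mixture` is hypothetical.

```python
import math
import random

def _normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def imcc_normal_mixture(values, k=2, n_iter=50, seed=0):
    """Fit a k-component normal mixture by sampled hard classification."""
    rng = random.Random(seed)
    n = len(values)
    lo, hi = min(values), max(values)
    grand = sum(values) / n
    spread = max(sum((x - grand) ** 2 for x in values) / n, 1e-6)
    # Assumed initialisation: means spread over the range, shared variance.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    variances = [spread / (k * k)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x in values:
            # Step 1: membership probabilities given the observed value.
            dens = [w * _normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            probs = [d / total for d in dens] if total > 0.0 else [1.0 / k] * k
            # Step 2: Monte Carlo classification, one sampled label per point.
            j = rng.choices(range(k), weights=probs)[0]
            groups[j].append(x)
        # Step 3: update each component from the points classified into it.
        for j, g in enumerate(groups):
            # Add-one smoothing (an assumption) keeps every weight positive.
            weights[j] = (len(g) + 1.0) / (n + k)
            if len(g) >= 2:
                means[j] = sum(g) / len(g)
                sq = sum((x - means[j]) ** 2 for x in g)
                variances[j] = max(sq / len(g), 1e-6)
    return weights, means, variances
```

Unlike EM, each point contributes to exactly one component per iteration, so the update in step 3 is an ordinary single-distribution fit; this is what would let the same loop be reused for other component families, such as t densities, by swapping the density and the per-group estimator.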