This letter studies a new expectation-maximization (EM) algorithm for circle, sphere and, more generally, hypersphere fitting. The algorithm relies on the introduction of random latent vectors having a priori independent von Mises-Fisher distributions defined on the hypersphere. This statistical model leads to a complete data likelihood whose expected value, conditioned on the observed data, has a von Mises-Fisher distribution. As a result, the inference problem can be solved with a simple EM algorithm. The performance of the resulting hypersphere fitting algorithm is evaluated for circle and sphere fitting.
This paper is concerned with the problem of parameter estimation for nonlinear Wiener systems in the stochastic framework. The expectation-maximization (EM) algorithm, which is designed for incomplete data, is applied to estimate the parameters of nonlinear Wiener models with randomly missing outputs. With the EM approach, the parameters and the missing outputs can be estimated simultaneously. To obtain the noise-free output of the linear subsystem of the Wiener model, the auxiliary model identification idea is adopted. The simulation results indicate the effectiveness of the proposed approach for identification of a class of nonlinear Wiener models.
Mixture of experts (ME) is a modular neural network architecture for supervised learning. This paper illustrates the use of the ME network structure to guide model selection for classification of electroencephalogram (EEG) signals. The expectation-maximization (EM) algorithm was used for training the ME so that the learning process is decoupled in a manner that fits well with the modular structure. The EEG signals were decomposed into time-frequency representations using the discrete wavelet transform, and statistical features were calculated to depict their distribution. The ME network structure was implemented for classification of the EEG signals using the statistical features as inputs. To improve classification accuracy, the outputs of the expert networks were combined by a gating network, trained simultaneously, that stochastically selects the expert performing best at solving the problem. Three types of EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures) were classified with an accuracy of 93.17% by the ME network structure. The ME network structure achieved accuracy rates higher than those of the stand-alone neural network models. (c) 2007 Elsevier Ltd. All rights reserved.
An expectation-maximization (EM) algorithm for factor analysis parameter estimation when observations are missing is developed. In contrast to existing EM algorithms for this problem, the algorithm here is developed assuming the missing observations are not part of the complete data in the EM formulation. The resulting algorithm provides increased computational efficiency through sparse matrix operations. The algorithm is demonstrated on two sparse, high-dimensional data sets that are prohibitively large for existing algorithms: the Netflix movie recommendation data set and the Yahoo! musical item data set. The resulting factor models are applied to predict missing values using conditional mean estimation, achieving root mean square errors of 0.9001 and 24.08 on the Netflix and Yahoo! data sets, respectively. (C) 2013 Elsevier B.V. All rights reserved.
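The conditional mean estimation mentioned in this abstract can be illustrated with the standard Gaussian conditioning formula. The sketch below is not the paper's implementation: it assumes a factor model x = mu + W z + e with z ~ N(0, I) and diagonal noise variances psi, uses dense NumPy arrays rather than sparse operations, and the function name `predict_missing` and its parameters are hypothetical.

```python
import numpy as np

def predict_missing(x, observed, mu, W, psi):
    """Fill missing entries of x with their conditional mean under a
    factor model x = mu + W z + e, z ~ N(0, I), e ~ N(0, diag(psi)).

    Hypothetical helper for illustration; entries of x where
    observed is False are ignored and overwritten.
    """
    x = np.asarray(x, dtype=float)
    o = np.asarray(observed, dtype=bool)
    m = ~o
    # Covariance of the observed block: W_o W_o^T + diag(psi_o).
    C_oo = W[o] @ W[o].T + np.diag(psi[o])
    # Cross-covariance between missing and observed entries
    # (the diagonal noise contributes nothing off the diagonal).
    C_mo = W[m] @ W[o].T
    x_hat = x.copy()
    # E[x_m | x_o] = mu_m + C_mo C_oo^{-1} (x_o - mu_o)
    x_hat[m] = mu[m] + C_mo @ np.linalg.solve(C_oo, x[o] - mu[o])
    return x_hat
```

For a single latent factor with unit loadings and unit noise variances, an observed value of 2 in one coordinate pulls the prediction for the other coordinate halfway toward it, since the observed variance is 2 and the cross-covariance is 1.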
We use direct numerical simulation data to study the identification of coherent vortical structures that generate strong scalar flux at the free surface of an open-channel turbulent flow. Using conventional conditional averaging of events with strong scalar surface flux or large vorticity components, we characterize the correlation of surface flux with a variety of subsurface vortical structures. We then present a clustering method based on the expectation-maximization algorithm which is shown to be effective in identifying dominant turbulence structure patterns. Using this method, clustering modes are obtained for different characteristic vorticity distributions on spanwise and streamwise vertical planes. It is found that each clustering mode can be constructed by a linear combination of a small number of enstrophy-containing eigenvectors obtained by proper orthogonal decomposition (POD). Compared with the POD eigenvectors, the clustering modes have a more direct correspondence to the turbulence structures in physical space. It is shown that ring-like and asymmetric cane vortices are the dominant vortical structures related to strong scalar surface flux in open-channel flow. The clustering method is general and can also be used for other types of flows and for applications beyond interfacial scalar transport. (C) 2012 Elsevier Ltd. All rights reserved.
In this study, we applied Bayesian networks to prioritize the factors that influence hazardous material (Hazmat) transportation accidents. The Bayesian network structure was built based on expert knowledge using Dempster-Shafer evidence theory, and the structure was modified based on a test for conditional independence. We collected and analyzed 94 cases of Chinese Hazmat transportation accidents to compute the posterior probability of each factor using the expectation-maximization learning algorithm. We found that the three most influential factors in Hazmat transportation accidents were human factors, the transport vehicle and facilities, and packing and loading of the Hazmat. These findings provide an empirically supported theoretical basis for Hazmat transportation corporations to take corrective and preventative measures to reduce the risk of accidents. (C) 2011 Elsevier Ltd. All rights reserved.
Cell segmentation is challenging owing to the existence of various experimental configurations, cell shapes that cannot be mathematically defined, and ambiguous cell boundaries. We propose a cell segmentation method using a cell-region discriminator and a multi-cell discriminator trained using heterogeneous machine-learning techniques such as logistic regression, expectation-maximization, and support vector machine (SVM). The cell-region discriminator identifies the regions where cells are found in images obtained from microscopes via a secondary logistic regressor, and its features use statistical information as well as the distribution of neighbor intensities. The SVM-based multi-cell discriminator determines whether multiple cells are present in the region detected by the cell-region discriminator and whether the region should be divided using the expectation-maximization algorithm. We suggest features for the boundary sectional area and the least square error for cell surface fitting to train the multi-cell discriminator. Using these features and the SVM, the multi-cell discriminator can be trained without overfitting, even with little training data. During this process, the proposed convex cell surface enhances the clustering performance. In experiments, our method based on two discriminators stably divided connected cells even when the contrast between a cell and the background area was small, and it outperformed state-of-the-art methods in terms of cell detection and segmentation accuracy.
This study proposes an approach to modeling the effects of daily roadway conditions on travel time variability using a finite mixture model based on the Gamma-Gamma (GG) distribution. The GG distribution is a compound distribution derived from the product of two Gamma random variates, which represent vehicle-to-vehicle and day-to-day variability, respectively. It provides a systematic way of investigating different variability dimensions reflected in travel time data. To identify the underlying distribution of each type of variability, this study first decomposes a mixture of Gamma-Gamma models into two separate Gamma mixture modeling problems and estimates the respective parameters using the expectation-maximization (EM) algorithm. The proposed methodology is demonstrated using simulated vehicle trajectories produced under daily scenarios constructed from historical weather and accident data. The parameter estimation results suggest that day-to-day variability exhibits clear heterogeneity under different weather conditions (clear versus rainy or snowy days), whereas the same weather conditions have little impact on vehicle-to-vehicle variability. Next, a two-component Gamma-Gamma mixture model is specified. The results of the distribution fitting show that the mixture model provides better fits to travel delay observations than the standard (one-component) Gamma-Gamma model. The proposed method, which applies the compound Gamma distribution within a mixture modeling approach, provides a powerful and flexible tool to capture not only the different types of variability (vehicle-to-vehicle and day-to-day) but also the unobserved heterogeneity within these variability types. It thereby allows the underlying distributions of individual travel delays across different days with varying roadway disruption levels to be modeled in a more effective and systematic way. (C) 2014 Elsevier Ltd. All rights reserved.
When searching for gene pathways leading to specific disease outcomes, additional information on gene characteristics is often available that may help differentiate genes related to the disease from irrelevant background genes when connections involving both types of genes are observed and their relationships to the disease are unknown. We propose a method to single out irrelevant background genes with the help of auxiliary information through a logistic regression, and to cluster relevant genes into cohesive groups using the adjacency matrix. The expectation-maximization algorithm is modified to maximize a joint pseudo-likelihood assuming latent indicators for relevance to the disease and latent group memberships, as well as Poisson or multinomial distributed link numbers within and between groups. A robust version allowing arbitrary linkage patterns within the background is further derived. Asymptotic consistency of label assignments under the stochastic blockmodel is proven. Superior performance and robustness in finite samples are observed in simulation studies. The proposed robust method identifies previously missed gene sets underlying autism-related neurological diseases using diverse data sources including de novo mutations, gene expressions, and protein-protein interactions.
An unsupervised change detection problem can be viewed as a classification problem with only two classes corresponding to the change and no-change areas, respectively. Thanks to its simplicity, image differencing is a widely used approach to change detection. It is based on the idea of generating a difference image that represents the modulus of the spectral change vector associated with each pixel in the study area. To separate the "change" and "no-change" classes in the difference image, a simple thresholding-based procedure can be applied. However, the selection of the best threshold value is not a trivial problem. We investigate and compare several simple thresholding methods. The combination of the expectation-maximization algorithm with a thresholding method is also performed for the purpose of achieving a better estimate of the optimal threshold value. As an experimental investigation, a study area damaged by a forest fire is considered. Two Landsat TM images of the area acquired before and after the event are utilized to detect the burnt zones and to assess and compare the mentioned unsupervised change-detection methods. (C) 2002 Society of Photo-Optical Instrumentation Engineers.
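One common way to combine EM with thresholding, sketched here under stated assumptions, is to model the difference-image values as a two-component Gaussian mixture ("no-change" and "change") and place the threshold where the two posterior probabilities are equal. The abstract does not specify the paper's exact model, so the Gaussian assumption, the median-split initialisation, and the function names `em_two_gaussians` and `posterior_threshold` are illustrative choices only.

```python
import numpy as np

def _weighted_densities(x, w, mu, var):
    """Per-sample weighted Gaussian densities, shape (n, 2)."""
    return w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def em_two_gaussians(x, n_iter=200, tol=1e-8):
    """Fit a two-component 1-D Gaussian mixture to x with EM."""
    x = np.asarray(x, dtype=float)
    # Crude initialisation (an assumption): split the data at its median.
    med = np.median(x)
    lo, hi = x[x <= med], x[x > med]
    w = np.array([0.5, 0.5])
    mu = np.array([lo.mean(), hi.mean()])
    var = np.array([lo.var(), hi.var()]) + 1e-6
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component per sample.
        dens = _weighted_densities(x, w, mu, var)
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Stop when the log-likelihood no longer improves noticeably.
        ll = np.log(total).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return w, mu, var

def posterior_threshold(w, mu, var, n_grid=10001):
    """First grid point between the two means where the higher-mean
    ("change") component has posterior probability of at least 0.5."""
    order = np.argsort(mu)
    w, mu, var = w[order], mu[order], var[order]
    grid = np.linspace(mu[0], mu[1], n_grid)
    dens = _weighted_densities(grid, w, mu, var)
    post_change = dens[:, 1] / dens.sum(axis=1)
    return grid[np.argmax(post_change >= 0.5)]
```

Pixels whose difference value exceeds the returned threshold would then be labelled "change". The grid search is used instead of solving the quadratic for the crossing point in closed form, which keeps the sketch short at the cost of grid-limited precision.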