The present study proposes a novel dynamic mode decomposition (DMD) method that simultaneously estimates the reduced-order model, the original signal, and the system and observation noise models from noisy data alone. An expectation-maximization (EM) algorithm DMD (EMDMD) combines DMD with parameter adjustment of a linear dynamical system (LDS) based on the EM algorithm. The initial parameters are set by DMD, based on the linearity of the reduced-order data. Subsequently, the log-likelihood of the complete data is maximized by adjusting the LDS parameters while separating the noise. The proposed algorithm is applied to benchmark data consisting of short-fat and tall-skinny data matrices with different noise, and to time-series velocity fields of the flow around a circular cylinder and the separated flow around an airfoil. The performance of EMDMD in system identification and noise separation from noisy data is evaluated, and EMDMD shows the highest system identification and noise separation performance on all datasets.
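As a point of reference, the linear dynamical system that such an EM procedure adjusts is commonly written in the state-space form below. The symbols ($x_t$, $y_t$, $A$, $C$, $Q$, $R$) are assumed notation for illustration and are not taken from the abstract; this is a sketch of the standard LDS, not necessarily the authors' exact formulation.

```latex
\begin{aligned}
x_{t+1} &= A\,x_t + w_t, & w_t &\sim \mathcal{N}(0, Q) && \text{(system noise)}\\
y_t     &= C\,x_t + v_t, & v_t &\sim \mathcal{N}(0, R) && \text{(observation noise)}
\end{aligned}
```

Under this reading, DMD would supply initial values for $A$ and $C$, and the EM iterations would refine $A$, $C$, $Q$, and $R$ by maximizing the expected complete-data log-likelihood.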
A Gaussian Mixture Model (GMM) is a parametric probability density function built as a weighted sum of Gaussian distributions. Gaussian mixtures are used for modelling probability distributions in many fields of research. Nevertheless, in many real applications, the components are skewed or heavy tailed. For that reason, it is useful to model the mixture components with alpha-stable distributions. In this work, we present a mixture of skewed alpha-stable distributions whose parameters are estimated using the expectation-maximization algorithm. As the Gaussian distribution is a particular limiting case of the alpha-stable distribution, the proposed model is a generalization of the widely used GMM. The proposed algorithm is much faster than parameter estimation of the alpha-stable mixture model using a Bayesian approach and Markov chain Monte Carlo methods. Therefore, it is more suitable for large vector observations. (C) 2020 Elsevier B.V. All rights reserved.
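For orientation, the Gaussian special case that the paper generalizes can be fitted with a few lines of EM. The following is a minimal sketch for a one-dimensional GMM, assuming user-supplied initial parameters; the function names (`normal_pdf`, `em_gmm_1d`) are hypothetical, and the alpha-stable variant described in the abstract is not implemented here.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, weights, means, variances, n_iter=100):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances).

    The three parameter lists are initial guesses (an assumption of this
    sketch); they are copied, so the caller's lists are not modified.
    """
    weights, means, variances = list(weights), list(means), list(variances)
    k, n = len(weights), len(data)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsing component
    return weights, means, variances
```

A call such as `em_gmm_1d(samples, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])` returns the fitted mixture weights, means, and variances. The sketch does not guard against observations so far from every component that all densities underflow to zero.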
We develop an expectation-maximization algorithm with local adaptivity for image segmentation and classification. The key idea of our approach is to combine global statistics extracted from the Gaussian mixture model or other proper statistical models with local statistics and geometrical information, such as local probability distribution, orientation, and anisotropy. The combined information is used to design an adaptive local classification strategy that improves the robustness of the algorithm and also keeps fine features in the image. The proposed methodology is flexible and can be easily generalized to deal with other inferred information/quantities and statistical methods/models.
ISBN: 9781479979929 (print)
Directional statistical distributions can be used to model a wide range of industrial and natural phenomena. Finite mixtures of circular normal von Mises (MvM) distributions have been used to represent directional data from various domains, including the energy industry, medical science, and information retrieval. This paper presents probabilistic modeling of prevailing wind directions. The expectation-maximization (EM) algorithm is employed to estimate the unknown parameters of the MvM distribution. The evaluation is carried out using real-world data sets describing annual wind direction at St. John's airport in Newfoundland, Canada. Experimental results show that the EM algorithm is able to find good model parameters for the input data. However, because the termination criterion, the chi-squared (χ²) function, converges to 335, the resulting distribution cannot pass Pearson's goodness-of-fit test.
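To make the model concrete, the density of a finite von Mises mixture can be evaluated as sketched below. This is an illustration only: the function names are hypothetical, the Bessel function is computed from its power series (adequate for moderate concentration values), and the EM parameter estimation described in the abstract is not reproduced.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms: (x/2)^2 / (k+1)^2.
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def von_mises_pdf(theta, mu, kappa):
    """von Mises density at angle theta (radians), mean direction mu, concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta, weights, mus, kappas):
    """Weighted sum of von Mises densities; the weights are assumed to sum to 1."""
    return sum(w * von_mises_pdf(theta, m, k) for w, m, k in zip(weights, mus, kappas))
```

For example, `mixture_pdf(math.pi / 4, [0.6, 0.4], [0.5, 3.0], [2.0, 5.0])` evaluates a two-component mixture at 45 degrees; an EM fit would estimate the weights, mean directions, and concentrations from observed wind directions.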
In this paper, the goodness-of-fit test based on a convex combination of Akaike and Bayesian information criteria is used to explain the features of interoccurrence times of earthquakes. By analyzing the seismic catalog of Iran for different tectonic settings, we have found that the probability distributions of time intervals between successive earthquakes can be described by the generalized normal distribution. This indicates that the sequence of successive earthquakes is not a Poisson process. It is found that by decreasing the threshold magnitude, the interoccurrence time distribution changes from the generalized normal distribution to the gamma distribution in some seismotectonic regions. As a new insight, the probability distribution of time intervals between earthquakes is described as a mixture distribution via the expectation-maximization algorithm.
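A convex combination of the two information criteria can be sketched as follows. The AIC and BIC formulas are the standard ones; the mixing parameter `weight` and the function names are assumptions for illustration, since the abstract does not state how the combination is parameterized.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def combined_criterion(log_likelihood, n_params, n_obs, weight):
    """Convex combination weight * AIC + (1 - weight) * BIC, with weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * aic(log_likelihood, n_params)
            + (1.0 - weight) * bic(log_likelihood, n_params, n_obs))
```

Candidate distributions (for example, generalized normal versus gamma) would each be fitted by maximum likelihood, and the one with the lowest combined score preferred.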
Scene text detection is an important and challenging task in computer vision. For detecting arbitrarily-shaped texts, most existing methods require heavy data labeling efforts to produce polygon-level text region labels for supervised training. To reduce the cost of data labeling, we study mixed-supervised arbitrarily-shaped text detection by combining various weak supervision forms (e.g., image-level tags and coarse, loose, and tight bounding boxes), which are far easier to annotate. Because existing weakly-supervised learning methods (such as multiple instance learning) do not promote full object coverage, we propose an expectation-maximization (EM) based mixed-supervised learning framework that approximates the performance of fully-supervised detection by training a scene text detector with only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data. The polygon-level labels are treated as latent variables and recovered from the weak labels by the EM algorithm. A new contour-based scene text detector is also proposed to facilitate the use of weak labels in our mixed-supervised learning framework. Extensive experiments on six scene text benchmarks show that (1) using only 10% strongly annotated data and 90% weakly annotated data, our method yields performance comparable to that of fully supervised methods, and (2) with 100% strongly annotated data, our method achieves state-of-the-art performance on five scene text benchmarks (CTW1500, Total-Text, ICDAR-ArT, MSRA-TD500, and C-SVT) and competitive results on the ICDAR2015 dataset. We will make our weakly annotated datasets publicly available.
An efficient initialization of the expectation-maximization algorithm to estimate mixture models via maximum likelihood is proposed. A fully unsupervised network-based initialization technique is provided by mapping time series to complex networks, using as adjacency matrix the Markov Transition Field associated with the time series. In this way, the optimal number of mixture model components and the vector of initial parameters can be directly obtained. An experiment conducted on financial time series with very different characteristics shows that our approach produces significantly better results than conventional initialization methods, such as K-means and Random, thus demonstrating the effectiveness of the proposed method. (c) 2022 Elsevier Inc. All rights reserved.
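The Markov Transition Field used as an adjacency matrix can be built as in the sketch below, which follows the usual construction: quantize the series into quantile bins, estimate the first-order transition matrix between bins, and then look up the transition probability for every pair of time points. The function names and the rank-based binning are assumptions for illustration; how the paper derives the number of components and the initial parameters from the resulting network is not shown.

```python
def quantile_bins(series, n_bins):
    """Assign each value to one of n_bins equal-count bins by rank (ties split by position)."""
    n = len(series)
    order = sorted(range(n), key=lambda i: series[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n
    return bins


def transition_matrix(bins, n_bins):
    """Row-normalized counts of transitions from bin a at time t to bin b at time t+1."""
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1.0
    for row in counts:
        total = sum(row)
        if total > 0:
            for j in range(n_bins):
                row[j] /= total
    return counts


def markov_transition_field(series, n_bins):
    """MTF[i][j] = probability of moving from the bin of x_i to the bin of x_j."""
    bins = quantile_bins(series, n_bins)
    w = transition_matrix(bins, n_bins)
    return [[w[bi][bj] for bj in bins] for bi in bins]
```

For a series of length n the result is an n-by-n matrix of values in [0, 1], which can be read as the weighted adjacency matrix of a network whose nodes are the time points.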
We develop a forward-reverse expectation-maximization (FREM) algorithm for estimating parameters of a discrete-time Markov chain evolving through a certain measurable state-space. For the construction of the FREM method, we develop forward-reverse representations for Markov chains conditioned on a certain terminal state. We prove almost sure convergence of our algorithm for a Markov chain model with curved exponential family structure. On the numerical side, we carry out a complexity analysis of the forward-reverse algorithm by deriving its expected cost. Two application examples are discussed.
The controlled branching process is a generalization of the classical Bienaymé-Galton-Watson branching process. It is a useful model for describing the evolution of populations in which the population size at each generation needs to be controlled. The maximum likelihood estimation of the parameters of interest for this process is addressed under various sampling schemes. Firstly, assuming that the entire family tree can be observed, the corresponding estimators are obtained and their asymptotic properties investigated. Secondly, since in practice such a sample is rarely observed, maximum likelihood estimation is considered first using the sample given by the total number of individuals and progenitors of each generation, and then using the sample given by only the generation sizes. Expectation-maximization algorithms are developed to address these problems as incomplete data estimation problems. The accuracy of the procedures is illustrated by means of a simulated example. (C) 2015 Elsevier B.V. All rights reserved.
Estimating the parameters of geophysical dynamic models is an important task in the data assimilation (DA) techniques used for forecast initialization and reanalysis. In the past, most parameter estimation strategies were derived by state augmentation, yielding algorithms that are easy to implement but may exhibit convergence difficulties. The expectation-maximization (EM) algorithm is considered advantageous because it employs two iterative steps to estimate the model state and the model parameters separately. In this work, we propose a novel ensemble formulation of the maximization step in EM that allows a direct optimal estimation of the physical parameters using iterative methods for linear systems. This departs from current EM formulations, which can only deal with additive model error structures. This contribution shows how the EM technique can be used for dynamics identification problems in which the model error is parameterized in an arbitrarily complex form. The proposed technique is used here to identify stochastic subgrid terms that account for processes unresolved by a geophysical fluid model. This method, together with the augmented state technique, is evaluated by estimating such subgrid terms from high-resolution data. Compared with the augmented state technique, our method is shown to yield considerably more accurate parameters. In addition, in terms of prediction capacity, it leads to a smaller generalization error arising from overfitting of the trained model to the presented data, and eventually to better forecasts.