This paper advocates a new subspace system identification algorithm for the errors-in-variables (EIV) state space model via the EM algorithm. To initialize the EM algorithm, an initial estimate is obtained by the errors-in-variables subspace system identification methods EIV-MOESP (Chou et al. [1997]) and EIV-N4SID (Gustafsson [2001]). The EM algorithm computes the maximum of the likelihood function and consists of two steps, namely the E- and M-steps. The E- and M-steps are carried out by computing conditional expectations under the assumption that the input-output data are completely observed. A numerical example shows that the EM algorithm can monotonically improve the initial estimates obtained by the subspace identification methods.
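As a hedged sketch of the iteration structure described above (standard EM, not the paper's specific derivation; here X denotes the unobserved complete data such as the states and noise-free inputs/outputs, Y the observed noisy data, and \(\theta\) the model parameters):

E-step: \( Q(\theta \mid \theta^{(j)}) = \mathbb{E}_{\theta^{(j)}}\big[\log p_\theta(X, Y) \,\big|\, Y\big] \); M-step: \( \theta^{(j+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(j)}) \). This guarantees \( \log p_{\theta^{(j+1)}}(Y) \ge \log p_{\theta^{(j)}}(Y) \), i.e., the monotone improvement over the EIV-MOESP/EIV-N4SID initial estimate noted above.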
Current status data appear in many biomedical studies when we only know whether an event of interest occurs before or after a specific time point. In this paper, we develop statistical inference for the estimation of parameters from current status data under the Lindley lifetime distribution, which is seen to work better than the exponential distribution in some lifetime contexts. We first develop an EM algorithm for maximum likelihood (ML) estimation and derive asymptotic confidence intervals for the model parameters. Then, we address the problem of model misspecification and define a new family of robust divergence-based estimators as a robust alternative to ML. Finally, we illustrate these methods through a simulation study as well as a numerical example.
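For intuition, the observed-data likelihood that such an EM algorithm targets can also be maximized directly. A minimal sketch, assuming the standard one-parameter Lindley CDF F(t; theta) = 1 - (1 + theta*t/(1+theta)) * exp(-theta*t) and hypothetical inspection times and event indicators:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def lindley_cdf(t, theta):
    # Standard one-parameter Lindley CDF (an assumption of this sketch).
    return 1.0 - (1.0 + theta * t / (1.0 + theta)) * np.exp(-theta * t)

def neg_loglik(theta, c, delta):
    # Current status likelihood: delta_i = 1 if the event occurred by the
    # inspection time c_i, so each subject contributes F(c_i) or 1 - F(c_i).
    F = lindley_cdf(c, theta)
    eps = 1e-12
    return -np.sum(delta * np.log(F + eps) + (1 - delta) * np.log(1 - F + eps))

# Hypothetical data: inspection times and current-status indicators.
rng = np.random.default_rng(0)
c = rng.uniform(0.5, 3.0, size=200)
true_theta = 1.2
delta = (rng.uniform(size=200) < lindley_cdf(c, true_theta)).astype(float)

res = minimize_scalar(neg_loglik, bounds=(1e-3, 20.0), args=(c, delta), method="bounded")
print("ML estimate of theta:", res.x)
```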
Two approaches to the problem of statistical separation of finite mixtures of probability distributions are discussed. The first consists in finding maximum likelihood estimates of the parameters of the mixture by the EM algorithm, whereas the second consists in finding the values of the parameters that minimize the distance between the theoretical and empirical distribution functions. It is demonstrated that the second approach is preferable, at least in the problem of statistical reconstruction of the coefficients of an Itô stochastic process that requires dynamic separation of finite normal mixtures in the moving-window mode. For this problem, the performance of the numerical procedures is critical. A combination of numerical procedures is described that attains (almost) the same value of the likelihood function for the second approach as the EM algorithm, but ensures a multiple decrease of the ℓ2-distance between the theoretical mixture and the empirical distribution function while demonstrating better performance. A kind of 'metric' regularization of the problem of likelihood maximization is proposed. The proposed techniques are illustrated by adjusting the Itô process model to the time series of the interplanetary magnetic field (magnetic flux density) registered by the Global Geospace Science (GGS) Wind spacecraft (placed at the Lagrange point between the Earth and the Sun).
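A minimal sketch of the second (minimum-distance) approach for a two-component normal mixture, with all symbols and data hypothetical: the parameters are chosen to minimize a squared ℓ2-type discrepancy between the theoretical mixture CDF and the empirical CDF evaluated at the sample points.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
# Hypothetical sample from a two-component normal mixture.
x = np.concatenate([rng.normal(0.0, 1.0, 700), rng.normal(3.0, 0.5, 300)])
x.sort()
ecdf = (np.arange(1, x.size + 1) - 0.5) / x.size  # empirical CDF at the sample points

def mixture_cdf(x, p, m1, s1, m2, s2):
    return p * norm.cdf(x, m1, s1) + (1 - p) * norm.cdf(x, m2, s2)

def l2_discrepancy(params):
    p, m1, ls1, m2, ls2 = params
    p = 1.0 / (1.0 + np.exp(-p))        # keep the mixture weight in (0, 1)
    s1, s2 = np.exp(ls1), np.exp(ls2)   # keep the scales positive
    return np.sum((mixture_cdf(x, p, m1, s1, m2, s2) - ecdf) ** 2)

res = minimize(l2_discrepancy, x0=[0.0, 0.0, 0.0, 2.0, 0.0], method="Nelder-Mead")
print("objective at the minimum-distance fit:", res.fun)
```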
The heavy-tailed multivariate normal inverse Gaussian (MNIG) distribution is a recent variance-mean mixture of a multivariate Gaussian with a univariate inverse Gaussian distribution. Due to the complexity of the likelihood function, parameter estimation by direct maximization is exceedingly difficult. To overcome this problem, we propose a fast and accurate multivariate expectation-maximization (EM) algorithm for maximum likelihood estimation of the scalar, vector, and matrix parameters of the MNIG distribution. Important fundamental and attractive properties of the MNIG as a modeling tool for multivariate heavy-tailed processes are discussed. The modeling strength of the MNIG, and the feasibility of the proposed EM parameter estimation algorithm, are demonstrated by fitting the MNIG to real-world hydrophone data, to wideband synthetic aperture sonar data, and to multichannel radar sea clutter data. (c) 2005 Elsevier B.V. All rights reserved.
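For orientation, in one common parameterization (the symbols here are illustrative and not necessarily those of the paper), the variance-mean mixture structure is \( X \mid W = w \sim \mathcal{N}_d(\mu + w\beta,\; w\Sigma) \) with \( W \) following a univariate inverse Gaussian law. This conditional Gaussianity is what makes an EM treatment natural: given W the model is Gaussian, so the E-step reduces to posterior moments of the mixing variable W.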
Estimating a channel that is subject to frequency-selective Rayleigh fading is a challenging problem in an orthogonal frequency division multiplexing (OFDM) system. We propose three EM-based algorithms to efficiently estimate the channel impulse response (CIR) or channel frequency response of such a system operating on a channel with multipath fading and additive white Gaussian noise (AWGN). These algorithms are capable of improving the channel estimate by making use of a modest number of pilot tones or by using the channel estimate of the previous frame to obtain the initial estimate for the iterative procedure. Simulation results show that the bit error rate (BER) as well as the mean square error (MSE) of the channel estimate can be significantly reduced by these algorithms. We present simulation results to compare these algorithms on the basis of their performance and rate of convergence. We also derive Cramér-Rao-like lower bounds for the unbiased channel estimate, which can be achieved via these EM-based algorithms. It is shown that the convergence rate of two of the algorithms is independent of the length of the multipath spread. One of them also converges most rapidly and has the smallest overall computational burden.
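A hedged sketch of only the initialization step (pilot-tone least squares, not the EM refinement itself; the subcarrier count, tap count, pilot positions, and noise level are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
N, L = 64, 8                     # subcarriers, channel taps (hypothetical)
pilot_idx = np.arange(0, N, 8)   # hypothetical comb-type pilot positions

h = (rng.normal(size=L) + 1j * rng.normal(size=L)) / np.sqrt(2 * L)  # Rayleigh taps
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / N)   # partial DFT matrix
H = F @ h                                                            # true frequency response

X_p = np.ones(pilot_idx.size)                         # known pilot symbols (all-ones BPSK)
noise = 0.05 * (rng.normal(size=pilot_idx.size) + 1j * rng.normal(size=pilot_idx.size))
Y_p = H[pilot_idx] * X_p + noise                      # received pilots over AWGN

# Least-squares CIR estimate from the pilots, then extend to all subcarriers.
H_p_ls = Y_p / X_p
h_ls, *_ = np.linalg.lstsq(F[pilot_idx, :], H_p_ls, rcond=None)
H_est = F @ h_ls
print("MSE over all subcarriers:", np.mean(np.abs(H_est - H) ** 2))
```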
The Generalized Hyperbolic distribution (Barndorff-Nielsen 1977) is a variance-mean mixture of a normal distribution with the Generalized Inverse Gaussian distribution. Recently, subclasses of these distributions (e.g., the hyperbolic distribution and the Normal Inverse Gaussian distribution) have been applied to construct stochastic processes in turbulence and particularly in finance, where multidimensional problems are of special interest. Parameter estimation for these distributions based on an i.i.d. sample is a difficult task even for a specified one-dimensional subclass (a subclass being uniquely defined by lambda) and relies on numerical methods. For the hyperbolic subclass (lambda = 1), the computer program 'hyp' (Blaesild and Sorensen 1992) estimates parameters via ML when the dimensionality is less than or equal to three. To the best of the author's knowledge, no successful attempts have been made to fit any given subclass when the dimensionality is greater than three. This article proposes a simple EM-based (Dempster, Laird and Rubin 1977) ML estimation procedure to estimate parameters of the distribution when the subclass is known, regardless of the dimensionality. Our method relies on the ability to numerically evaluate modified Bessel functions of the third kind and their logarithms, which is made possible by currently available software. The method is applied to fit the five-dimensional Normal Inverse Gaussian distribution to a series of returns on foreign exchange rates.
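Regarding the numerical evaluation of modified Bessel functions of the third kind and their logarithms, a minimal sketch using SciPy's exponentially scaled routine (kve(v, x) returns K_v(x) * exp(x), so the logarithm can be formed without underflow):

```python
import numpy as np
from scipy.special import kve

def log_bessel_k(order, x):
    # log K_order(x), stable for large x because kve returns K_order(x) * exp(x).
    return np.log(kve(order, x)) - x

print(log_bessel_k(1.0, 0.5), log_bessel_k(-0.5, 800.0))
```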
Suppose survival times follow an exponential distribution, and some observations are right-censored: in this situation the EM algorithm gives a straightforward solution to the problem of maximum likelihood estimation. But what happens if survival times are also left-censored, or if they follow a uniform distribution? The EM algorithm is a generic device useful in a variety of problems with incomplete data, and it appears more and more often in statistical textbooks. This article presents two exercises, which are extensions of a well-known example used in introductions to the EM algorithm. They focus on two points: the applicability of the algorithm and its self-consistency property.
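A minimal sketch of that well-known exponential/right-censoring case (hypothetical data; the E-step uses the memoryless property E[T | T > c] = c + 1/rate):

```python
import numpy as np

rng = np.random.default_rng(3)
true_rate = 0.5
t = rng.exponential(1.0 / true_rate, size=500)
c = np.full_like(t, 3.0)                 # fixed censoring time (hypothetical)
observed = np.minimum(t, c)
censored = t > c

rate = 1.0                               # initial guess
for _ in range(200):
    # E-step: expected survival time for censored units is c + 1/rate
    # (memoryless property of the exponential distribution).
    expected_t = np.where(censored, observed + 1.0 / rate, observed)
    # M-step: ML update of the exponential rate given the completed data.
    rate = t.size / expected_t.sum()

print("EM estimate of the rate:", rate)
```

The fixed point of this iteration is the usual censored-data MLE, i.e. the number of observed events divided by the total time at risk.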
ISBN (print): 9781509037360
In this paper, a soft probabilistic clustering algorithm for multidimensional data sets that arrive sequentially for processing in on-line mode is investigated. The proposed system solves Data Stream Mining tasks when classes overlap.
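As a rough, hypothetical sketch of the kind of on-line soft (probabilistic) update such a system might use (not the authors' algorithm): each incoming point updates cluster responsibilities and prototypes incrementally.

```python
import numpy as np

def online_soft_cluster(stream, k=2, beta=2.0, seed=0):
    # Incremental soft clustering: responsibilities from a softmax over negative
    # squared distances, prototypes updated as responsibility-weighted running means.
    rng = np.random.default_rng(seed)
    centers, counts = None, np.zeros(k)
    for x in stream:
        x = np.asarray(x, dtype=float)
        if centers is None:
            centers = x + 0.01 * rng.normal(size=(k, x.size))
        d2 = np.sum((centers - x) ** 2, axis=1)
        r = np.exp(-beta * d2)
        r /= r.sum()                                       # soft memberships (overlapping classes)
        counts += r
        centers += (r / counts)[:, None] * (x - centers)   # per-cluster running mean
    return centers

stream = np.random.default_rng(4).normal(size=(1000, 2)) + np.repeat([[0, 0], [4, 4]], 500, axis=0)
print(online_soft_cluster(stream, k=2))
```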
Entity alignment is the problem of identifying which entities in a data source refer to the same real-world entity in the others. Aligning entities across heterogeneous data sources is paramount to many research fields, such as data cleaning, data integration, information retrieval and machine learning. The aligning process is not only overwhelmingly expensive for large data sources, since it involves all tuples from two or more data sources, but also needs to handle heterogeneous entity attributes. In this paper, we propose an unsupervised approach, called EnAli, to match entities across two or more heterogeneous data sources. EnAli employs a generative probabilistic model to incorporate the heterogeneous entity attributes via the exponential family, handle missing values, and also utilizes a locality-sensitive hashing schema to reduce the candidate tuples and speed up the aligning process. EnAli is highly accurate and efficient even without any ground-truth tuples. We illustrate the performance of EnAli on re-identifying entities from the same data source, as well as aligning entities across three real data sources. Our experimental results manifest that our proposed approach outperforms the comparable baseline.
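As a hedged illustration of the candidate-reduction idea only (a generic MinHash/banding LSH sketch, not EnAli's actual hashing schema or generative model; all record names and tokens are hypothetical):

```python
import hashlib
from collections import defaultdict
from itertools import combinations

def minhash_signature(tokens, num_hashes=20):
    # One MinHash value per seeded hash function.
    return [min(int(hashlib.md5(f"{seed}:{t}".encode()).hexdigest(), 16) for t in tokens)
            for seed in range(num_hashes)]

def lsh_candidates(records, num_hashes=20, band_size=5):
    # Records sharing an identical band of the signature become candidate pairs,
    # so most of the cross product of tuples is never compared.
    buckets = defaultdict(list)
    for rid, tokens in records.items():
        sig = minhash_signature(tokens, num_hashes)
        for b in range(0, num_hashes, band_size):
            buckets[(b, tuple(sig[b:b + band_size]))].append(rid)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs

records = {"a1": {"john", "smith", "ny"}, "b7": {"john", "smith", "ny"}, "c3": {"alice", "lee", "sf"}}
print(lsh_candidates(records))
```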
The characterization of discontinuities within rock masses is often accomplished using stochastic discontinuity network models, in which the stochastic nature of the discontinuity network is represented by means of statistical distributions. We present a flexible methodology for maximum likelihood inference of the distribution of discontinuity trace lengths based on observations at rock outcrops. The inference problem is formulated using statistical graphical models and target distributions with several Gaussian mixture components. We use the Expectation-Maximization algorithm to exploit the relations of conditional independence between variables in the maximum likelihood estimation problem. Initial results using artificially generated discontinuity traces show that the method has good inference capabilities, and the inferred trace length distributions closely reproduce those used for generation. In addition, the convergence of the algorithm is shown to be fast. (c) 2006 Elsevier Ltd. All rights reserved.
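A minimal sketch of the Gaussian-mixture EM core referred to above (one-dimensional trace lengths with hypothetical data; it omits the graphical-model structure and outcrop-sampling corrections the paper addresses):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
# Hypothetical trace-length sample from two Gaussian components.
x = np.concatenate([rng.normal(1.5, 0.4, 400), rng.normal(4.0, 1.0, 200)])

K = 2
w = np.full(K, 1.0 / K)
mu = np.array([x.min(), x.max()], dtype=float)
sigma = np.full(K, x.std())

for _ in range(100):
    # E-step: responsibility of each component for each observation.
    dens = np.stack([w[k] * norm.pdf(x, mu[k], sigma[k]) for k in range(K)])
    resp = dens / dens.sum(axis=0)
    # M-step: weighted ML updates of the mixture parameters.
    nk = resp.sum(axis=1)
    w = nk / x.size
    mu = (resp * x).sum(axis=1) / nk
    sigma = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)

print("weights:", w, "means:", mu, "std devs:", sigma)
```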