When field-collected data are biased by unexpected sensor and measurement errors, a simple Wiener process may fail to estimate the true degradation path correctly. Most existing studies add Gaussian errors to the true degradation path to account for measurement error. This assumption is sensitive to unexpected outliers arising during data collection. To achieve robust estimation of the underlying degradation process, we propose modeling the measurement errors with a family of heavy-tailed distributions called scale-mixture normal (SMN) distributions. An SMN distribution can be expressed as a Gaussian hierarchy, which makes it more robust to unexpected outliers. We develop an efficient expectation-maximization (EM) algorithm incorporating the variational Bayesian method to estimate the model parameters. We also derive the distribution of the remaining useful life for online monitoring. The efficiency of the model is verified by Monte Carlo simulations, and its performance on real data is illustrated with hard disk drive and thrust ball bearing degradation data.
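The Gaussian hierarchy mentioned in this abstract can be sketched for one familiar member of the SMN family, the Student-t distribution. This is a minimal illustration only, not the authors' model or estimator: the function names, the choice of Student-t, the drift and diffusion values, and the additive form Y = X + error are all assumptions made for the example.

```python
import math
import random


def smn_student_t_noise(n, nu, sigma, rng):
    """Draw n Student-t errors through a Gaussian hierarchy (assumed SMN member).

    lam ~ Gamma(shape=nu/2, rate=nu/2), then eps | lam ~ N(0, sigma^2 / lam).
    """
    errors = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); scale = 1 / rate = 2 / nu.
        lam = rng.gammavariate(nu / 2.0, 2.0 / nu)
        errors.append(rng.gauss(0.0, sigma / math.sqrt(lam)))
    return errors


def noisy_wiener_path(n, dt, drift, diffusion, nu, sigma, rng):
    """Simulate a drifted Wiener path X and observations Y = X + heavy-tailed error."""
    errors = smn_student_t_noise(n, nu, sigma, rng)
    x = 0.0
    true_path, observed = [], []
    for eps in errors:
        # Wiener increment: drift * dt plus Gaussian noise with variance diffusion^2 * dt.
        x += drift * dt + diffusion * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        true_path.append(x)
        observed.append(x + eps)
    return true_path, observed


if __name__ == "__main__":
    rng = random.Random(0)
    true_path, observed = noisy_wiener_path(50, 1.0, 0.5, 0.2, 4.0, 0.3, rng)
    worst = max(abs(y - x) for x, y in zip(true_path, observed))
    print("largest measurement error:", round(worst, 3))
```

Small draws of the mixing variable `lam` inflate the conditional variance, which is how the hierarchy produces occasional large errors while each error remains conditionally Gaussian.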
With the surge of Internet of Things (IoT) applications using unmanned aerial vehicles (UAVs), there is strong demand for an excellent complexity/power-efficiency trade-off and channel fading resistance at the physical layer. In this paper, we consider the blind equalization of short continuous-phase-modulated (CPM) bursts for UAV-aided IoT. To address the high complexity and poor convergence of short-burst CPM blind equalization, a novel turbo blind equalization algorithm is proposed, based on a new expectation-maximization Viterbi (EMV) algorithm and a turbo scheme. First, a low-complexity blind equalization algorithm is obtained by applying the soft-output Lazy Viterbi algorithm within the EM algorithm iteration. Furthermore, a set of initializers that achieves a high probability of global convergence is designed using the blind channel-acquisition (BCA) method. Meanwhile, a soft-information iterative process is used to improve system performance. Finally, the convergence, bit error rate, and real-time performance of iterative detection are further improved by using improved exchange methods for extrinsic information and the stopping criterion. The analysis and simulation results show that the proposed algorithm achieves good blind equalization performance at low complexity.
Many multivariate statistical analysis methods and their corresponding probabilistic counterparts have been adopted to develop process monitoring models in recent decades. However, the insightful connections between them have rarely been studied. In this study, a generalized probabilistic monitoring model (GPMM) is developed for both random and sequential data. Since GPMM can be reduced to various probabilistic linear models under specific restrictions, it is adopted to analyze the connections between different monitoring methods. Using the expectation-maximization (EM) algorithm, the parameters of GPMM are estimated for both the random and sequential cases. Based on the obtained model parameters, statistics are designed for monitoring different aspects of the process system. In addition, the distributions of these statistics are rigorously derived and proved, so that the control limits can be calculated accordingly. Contribution analysis methods are then presented for identifying faulty variables once process anomalies are detected. Finally, the equivalence between monitoring models based on classical multivariate methods and their corresponding probabilistic graphical models is further investigated. The conclusions of this study are verified using a numerical example and the Tennessee Eastman (TE) process. Experimental results illustrate that the proposed monitoring statistics follow their corresponding distributions, and that they are equivalent to statistics in classical deterministic models under specific restrictions. (C) 2022 Elsevier Ltd. All rights reserved.
Deoxyribonucleic acid, more commonly known as DNA, is a complex double-helix-shaped molecule that is present in all living organisms and hosts thousands of genes. However, only a few genes exhibit differential expression and play a vital role in a particular disease such as breast cancer. Microarray technology is one of the modern technologies developed to study gene expression. Two major microarray technologies are available for expression analysis: the spotted cDNA array and the oligonucleotide array. The focus of our research is the statistical analysis of data arising from the spotted cDNA microarray. Numerous models have been proposed in the literature to identify differentially expressed genes from the red and green intensities measured by cDNA microarrays. Motivated by the Bayesian models described in Newton et al. (2001) and Mav and Chaganty (2004), we propose two models for the joint distribution of the red and green intensities using a Gaussian copula, which accounts for the dependence between them. In both models, we assume the marginals are gamma distributed. Under the first proposed copula model, the differentially expressed genes are identified by calculating the Bayes estimates of the differential expression. The second copula model incorporates a latent Bernoulli variable that indicates differential expression. The EM algorithm is applied to calculate the posterior probabilities of differential expression for the second model, and these posterior probabilities are used to rank the genes. We conducted two simulation studies to check the parameter estimation for the Gaussian copula-based models. We show that our models improve on the models given in Newton et al. (2001) and Mav and Chaganty (2004). We have also studied the use of the Weibull distribution instead of the gamma distribution for the marginals. Our analysis shows that the copula models with Weibull marginals provide a better fit and improve the identification of genes. Finally, we illustrate the application of our models o
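The role of a latent Bernoulli indicator in an EM algorithm can be sketched with a much simpler stand-in. The example below is an assumption-laden illustration, not the copula model of this abstract: it swaps in a one-dimensional two-component Gaussian mixture, and the function names and the initialization are hypothetical. The E-step computes the posterior probability that each observation belongs to the "expressed" group, and those probabilities can then be used for ranking.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_groups(xs, n_iter=100):
    """EM for p * N(m1, s1^2) + (1 - p) * N(m0, s0^2) with a latent Bernoulli label.

    Returns p, (m0, s0), (m1, s1) and the posterior probabilities from the last E-step.
    """
    ordered = sorted(xs)
    half = len(xs) // 2
    # Crude initialization (an assumption): split the sorted data in half.
    m0 = sum(ordered[:half]) / half
    m1 = sum(ordered[half:]) / (len(xs) - half)
    s0 = s1 = max(1e-3, (ordered[-1] - ordered[0]) / 4.0)
    p = 0.5
    post = [0.5] * len(xs)
    for _ in range(n_iter):
        # E-step: posterior probability that the latent indicator equals 1.
        post = []
        for x in xs:
            a = p * normal_pdf(x, m1, s1)
            b = (1.0 - p) * normal_pdf(x, m0, s0)
            post.append(a / (a + b) if a + b > 0.0 else 0.5)
        # M-step: weighted updates of the mixing proportion, means, and scales.
        w1 = max(sum(post), 1e-12)
        w0 = max(len(xs) - w1, 1e-12)
        p = w1 / len(xs)
        m1 = sum(r * x for r, x in zip(post, xs)) / w1
        m0 = sum((1.0 - r) * x for r, x in zip(post, xs)) / w0
        v1 = sum(r * (x - m1) ** 2 for r, x in zip(post, xs)) / w1
        v0 = sum((1.0 - r) * (x - m0) ** 2 for r, x in zip(post, xs)) / w0
        s1 = max(1e-3, math.sqrt(v1))
        s0 = max(1e-3, math.sqrt(v0))
    return p, (m0, s0), (m1, s1), post


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(300)]
    data += [rng.gauss(5.0, 1.0) for _ in range(100)]
    p, null_group, expressed_group, post = em_two_groups(data)
    # Rank observations by posterior probability of belonging to the second group.
    ranking = sorted(range(len(data)), key=lambda i: post[i], reverse=True)
    print("estimated proportion:", round(p, 3), "top index:", ranking[0])
```

The returned posterior probabilities come from the final E-step, so they correspond to the parameters of the second-to-last M-step; after convergence the difference is negligible.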
Stochastic block models have attracted growing interest in the social network literature. They provide a tool for discovering communities and identifying clusters of individuals characterized by similar social behaviors. In this framework, full maximum likelihood estimates are not achievable because the likelihood function is intractable. For this reason, several approximate solutions are available in the literature. Here, a new and more efficient approximate method for estimating the model parameters is introduced. It is hybrid in nature, in the sense that it exploits different features of existing methods. The proposal is illustrated by an intensive Monte Carlo simulation study and an application to a real-world network. (C) 2022 Elsevier B.V. All rights reserved.
We develop a variational Bayesian (VB) approach for estimating large-scale dynamic network models in the network autoregression framework. The VB approach allows for the automatic identification of the dynamic structure of such a model and obtains a direct approximation of the posterior density. Compared to the Markov chain Monte Carlo (MCMC)-based sampling approaches, the VB approach achieves enhanced computational efficiency without sacrificing estimation accuracy. In a real data analysis scenario of day-ahead natural gas flow prediction in the German gas transmission network with 51 nodes between October 2013 and September 2015, the VB approach delivers promising forecasting accuracy along with clearly detected structures in terms of dynamic dependence. (C) 2021 Elsevier B.V. All rights reserved.
We present a Gaussian Process - Latent Class Choice Model (GP-LCCM) to integrate a nonparametric class of probabilistic machine learning within discrete choice models (DCMs). Gaussian Processes (GPs) are kernel-based algorithms that incorporate expert knowledge by assuming priors over latent functions rather than priors over parameters, which makes them more flexible in addressing nonlinear problems. By integrating a Gaussian Process within an LCCM structure, we aim to improve discrete representations of unobserved heterogeneity. The proposed model assigns individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs and simultaneously estimates class-specific choice models by relying on random utility models. Furthermore, we derive and implement an Expectation-Maximization (EM) algorithm to jointly estimate/infer the hyperparameters of the GP kernel function and the class-specific choice parameters by relying on a Laplace approximation and gradient-based numerical optimization methods, respectively. The model is tested on two different mode choice applications and compared against different LCCM benchmarks. Results show that GP-LCCM allows for a more complex and flexible representation of heterogeneity and improves both in-sample fit and out-of-sample predictive power. Moreover, behavioral and economic interpretability is maintained at the class-specific choice model level, and local interpretation of the latent classes can still be achieved, although the nonparametric character of GPs lessens the transparency of the model.
In an industrial context, the activity of sensors is recorded at a high frequency. A challenge is to automatically detect abnormal measurement behavior. Considering the sensor measurements as functional data, the problem can be formulated as the detection of outliers in a multivariate functional data set. Because this data set is heterogeneous, the proposed contaminated mixture model both clusters the multivariate functional data into homogeneous groups and detects outliers. The main advantage of this procedure over its competitors is that it does not require the proportion of outliers to be specified. Model inference is performed through an Expectation-Conditional Maximization algorithm, and the BIC is used to select the number of clusters. Numerical experiments on simulated data demonstrate the high performance achieved by the inference algorithm. In particular, the proposed model outperforms its competitors. Applied to the real data that motivated this study, it correctly detects abnormal behaviors. (C) 2022 Elsevier B.V. All rights reserved.
State-space models (SSMs) with Markov switching offer a powerful framework for detecting multiple regimes in time series, analyzing mutual dependence and dynamics within regimes, and assessing transitions between regimes. These models, however, present considerable computational challenges because the number of possible regime sequences to account for grows exponentially. In addition, high dimensionality of the time series can hinder likelihood-based inference. To address these challenges, novel statistical methods for Markov-switching SSMs are proposed using maximum likelihood estimation, Expectation-Maximization (EM), and the parametric bootstrap. Solutions are developed for initializing the EM algorithm, accelerating convergence, and conducting inference. These methods, which are ideally suited to massive spatio-temporal data such as brain signals, are evaluated in simulations, and applications to EEG studies of epilepsy and of motor imagery are presented. (C) 2022 Elsevier B.V. All rights reserved.
We study distributions connected with joint events (X, N), where X is the sum of N non-negative random variables {Xi}, which may be dependent on or independent of N, and with joint events (X, N, Y, M). In the latter, X and Y are the sums of N and M non-negative random variables {Xi} and {Yi}, respectively. Models of this kind arise quite naturally in several areas such as finance, actuarial science, hydro-climatic studies and others. In finance, the quantity N may represent the duration of growth in the value of an investment, where a group of consecutive log-returns have the same positive sign and X is the cumulative log-return. In turn, the quantity M may represent the duration of decline in the value of an investment, where a group of consecutive log-returns have the same negative sign and Y is the cumulative log-return. In actuarial science, N represents the claim frequency and X represents the aggregate claim amount in a given time period. In clinical studies, N represents the number of hospital visits and X corresponds to the cumulative hospitalization cost. Our results include generalizations and formulations of bivariate and multivariate distributions that go beyond existing models. In addition to theoretical results, we also consider the practical problem of parameter estimation using maximum likelihood, EM algorithms, and simulation studies to validate our estimation strategies and applications
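A joint event (X, N) of this kind can be simulated in a few lines. The sketch below is one simple instance chosen for illustration, not a model taken from the abstract: it assumes N is geometric on {1, 2, ...}, the Xi are independent exponential variables, and the Xi are independent of N. The function name and parameters are hypothetical.

```python
import random


def draw_sum_and_count(p, rate, rng):
    """Draw one pair (X, N): N ~ Geometric(p) on {1, 2, ...}, X = sum of N Exp(rate) terms."""
    n = 1
    # Count trials until the first success; each trial succeeds with probability p.
    while rng.random() >= p:
        n += 1
    # Assumption: the summands are i.i.d. and independent of N.
    x = sum(rng.expovariate(rate) for _ in range(n))
    return x, n


if __name__ == "__main__":
    rng = random.Random(1)
    pairs = [draw_sum_and_count(0.5, 1.0, rng) for _ in range(20000)]
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_n = sum(n for _, n in pairs) / len(pairs)
    # Under these assumptions E[X] = E[N] * E[Xi] = (1 / p) * (1 / rate).
    print("mean X:", round(mean_x, 2), "mean N:", round(mean_n, 2))
```

In the actuarial reading given above, `n` would play the role of the claim count and `x` the aggregate claim amount for one period.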