In this paper, we propose a penalized estimation method for finite mixtures of ultra-high dimensional regression models. A two-step procedure is explored. First, we conduct order selection with the number of components unknown. Then variable selection is applied to the ultra-high dimensional regression models. A specific EM algorithm is designed to maximize the penalized log-likelihood function. We demonstrate our method with numerical simulations, in which it performs well. Further, an empirical study of return on equity (ROE) prediction is presented to support our methodology.
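As a rough illustration of the EM idea mentioned in this abstract, the following is a minimal sketch of plain (unpenalized) EM for a two-component mixture of linear regressions. It is not the authors' penalized algorithm: the function name `em_mixture_regression`, the simulated data, the starting values, and the fixed iteration count are all assumptions made for this example.

```python
import numpy as np

def em_mixture_regression(x, y, betas, sigma=1.0, n_iter=100):
    """Unpenalized EM for a K-component mixture of simple linear regressions.

    Hypothetical helper for illustration; `betas` holds the starting
    (intercept, slope) pair of each component.
    """
    X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
    betas = np.array(betas, dtype=float)           # shape (K, 2)
    K = betas.shape[0]
    weights = np.full(K, 1.0 / K)                  # mixing proportions
    sigmas = np.full(K, float(sigma))              # component noise scales
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample
        resid = y[:, None] - X @ betas.T           # shape (n, K)
        log_dens = -0.5 * (resid / sigmas) ** 2 - np.log(sigmas) + np.log(weights)
        log_dens -= log_dens.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(log_dens)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: weighted least squares and weighted variance per component
        for k in range(K):
            w = resp[:, k]
            XtW = X.T * w
            betas[k] = np.linalg.solve(XtW @ X, XtW @ y)
            r = y - X @ betas[k]
            sigmas[k] = np.sqrt((w * r ** 2).sum() / w.sum())
        weights = resp.mean(axis=0)
    return weights, betas, sigmas

# Simulated data (assumed for the example): two regression lines, equal proportions.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(-3.0, 3.0, n)
z = rng.random(n) < 0.5
y = np.where(z, 1.0 + 2.0 * x, -1.0 - 2.0 * x) + rng.normal(0.0, 0.5, n)

weights, betas, sigmas = em_mixture_regression(x, y, betas=[[0.0, 1.0], [0.0, -1.0]])
print("mixing proportions:", weights)
print("intercepts and slopes:", betas)
```

A penalized variant would, under the same assumptions, replace the weighted least-squares update in the M-step with a penalized fit; the E-step would be unchanged.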
In this paper, the stress-strength reliability of single and multi-component systems is estimated assuming discrete phase type distributions for the stress and strength components. Systems whose strength follows a mixture of discrete phase type distributions are also considered. Matrix-based expressions are obtained for the stress-strength reliability, and its maximum likelihood estimate is obtained using the EM algorithm. Numerical illustrations are carried out for various special cases of the discrete phase type distribution, such as the geometric, negative binomial, and generalized negative binomial distributions, and for different mixtures of discrete distributions.
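To make the quantity in this abstract concrete, the sketch below computes a single-component stress-strength reliability numerically for discrete phase type laws and checks it against the geometric special case. Several things here are assumptions for illustration rather than the paper's expressions: reliability is taken as R = P(stress < strength), each law is given by an initial vector `alpha` with no mass on the absorbing state and a sub-stochastic matrix `T`, and the infinite sum is truncated at `kmax` terms. The helper names are hypothetical.

```python
import numpy as np

def dph_pmf_sf(alpha, T, kmax):
    """P(X = k) and P(X > k), k = 1..kmax, for a discrete phase type law (alpha, T)."""
    alpha = np.asarray(alpha, dtype=float)
    T = np.asarray(T, dtype=float)
    ones = np.ones(T.shape[0])
    exit_probs = ones - T @ ones       # one-step absorption probability per phase
    pmf = np.empty(kmax)
    sf = np.empty(kmax)
    v = alpha.copy()                   # holds alpha @ T^(k-1)
    for k in range(kmax):
        pmf[k] = v @ exit_probs        # P(X = k + 1)
        v = v @ T
        sf[k] = v @ ones               # P(X > k + 1)
    return pmf, sf

def stress_strength(alpha_x, T_x, alpha_y, T_y, kmax=2000):
    """Truncated sum of P(stress = k) * P(strength > k), i.e. P(stress < strength)."""
    pmf_x, _ = dph_pmf_sf(alpha_x, T_x, kmax)
    _, sf_y = dph_pmf_sf(alpha_y, T_y, kmax)
    return float(pmf_x @ sf_y)

# Geometric special case on {1, 2, ...}: one phase, T = [[1 - p]].
p1, p2 = 0.3, 0.1                      # assumed success probabilities (stress, strength)
r_matrix = stress_strength([1.0], [[1 - p1]], [1.0], [[1 - p2]])
# Closed form for two independent geometric variables, from summing the series directly.
r_closed = p1 * (1 - p2) / (1 - (1 - p1) * (1 - p2))
print(r_matrix, r_closed)
```

The same two functions accept multi-phase inputs, for example a negative binomial law written as a chain of geometric phases.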
Various manipulations on JPEG images introduce single and multiple compression artifacts for forged and unmodified areas, respectively. Based on a statistical analysis of the JPEG compression cycle and on the finite mixture paradigm, we propose in this paper a modeling framework for the AC DCT coefficients of such tampered JPEG images. Its accuracy is numerically assessed using the Kullback-Leibler divergence on a tampered JPEG image dataset built from six well-known uncompressed color image databases. To illustrate the utility of the framework, an application to image forgery localization is proposed. By formulating localization as a clustering problem, we use the plug-in Bayes rule combined with a simple EM algorithm to distinguish between forged and unmodified areas. Numerous experiments show that, when the quality factor of the final JPEG compression is high enough, the proposed modeling framework yields higher localization performance in terms of F1-score than prior art, regardless of the diverse local manipulations.
In this article, the parameter learning problem is studied for stochastic Boolean networks (SBNs). Both the measurement noise and the system noise are assumed to be white and are modeled by sequences of mutually independent Bernoulli distributed stochastic variables. An algebraic representation of the SBNs is obtained by taking advantage of the vector expression of logical variables and applying the semi-tensor product technique. Consequently, the parameter learning problem is reformulated as an optimization problem, which makes it possible to identify the system matrices of SBNs in a computationally efficient way. Subsequently, properties of the forward and backward probabilities are investigated, and the EM algorithm is utilized to learn the model parameters from time series data. Finally, a numerical experiment is presented to show the usefulness of the designed parameter learning algorithm.
ISBN: (print) 9781450394161
Multi-Source Domain Adaptation (MSDA) is widely used in various machine learning scenarios for domain shifts between labeled source domains and unlabeled target domains. Conventional MSDA methods are built on the strong hypothesis that data samples from the same source belong to the same domain with the same latent distribution. However, in practice, sources and their latent domains are not necessarily in one-to-one correspondence. To tackle this problem, a novel Multi-source Reconstructed Domain Adaptation (MRDA) framework for MSDA is proposed. We use an Expectation-Maximization (EM) mechanism that iteratively reconstructs the source domains to recover the latent domains and performs domain adaptation on the reconstructed domains. Specifically, in the E-step, we cluster the samples from multiple sources into different latent domains, and a soft assignment strategy is proposed to avoid cluster imbalance. In the M-step, we freeze the latent domains clustered in the E-step and optimize the objective function for domain adaptation, and a global-specific feature extractor is used to capture both domain-invariant and domain-specific features. Extensive experiments demonstrate that our approach can reconstruct source domains and perform domain adaptation on the reconstructed domains effectively, thus significantly outperforming state-of-the-art (SOTA) baselines (e.g., 1% to 3.1% absolute improvement in AUC).
Many engineering products have more than one failure mode and the evolution of each mode can be monitored by measuring a performance characteristic (PC). It is found that the underlying multi-dimensional degradation often occurs with inherent process stochasticity and heterogeneity across units, as well as dependency among PCs. To accommodate these features, in this paper, we propose a novel multivariate degradation model based on the inverse Gaussian process. The model incorporates random effects that are subject to a multivariate normal distribution to capture both the unit-wise variability and the PC-wise dependence. Built upon this structure, we obtain some mathematically tractable properties such as the joint and conditional distribution functions, which subsequently facilitate future degradation prediction and lifetime estimation. An expectation-maximization algorithm is developed to infer the model parameters, along with validation tools for model checking. In addition, two simulation studies are performed to assess the performance of the inference method and to evaluate the effect of model misspecification. Finally, the application of the proposed methodology is demonstrated by two illustrative examples. (c) 2021 Elsevier B.V. All rights reserved.
Causal inference is the process of uncovering the causal relationship between an effect variable and a disease outcome in epidemiologic research. When estimating a causal effect in observational studies, confounders that influence both the effect variable and the outcome need to be adjusted for in the estimation process. In addition, missing data often arise in the data collection procedure; working with complete cases often results in biased parameter estimates. We consider causal effect estimation in the presence of missingness in the confounders under the missing at random assumption. We investigate how doubly robust estimators perform when applying complete-case analysis or multiple imputation. Given the uncertainty about the appropriate imputation model and the computational challenge of many imputations, we propose an expectation-maximization (EM) algorithm to estimate the expected values of the missing confounder and utilize a weighting approach in the estimation of the average treatment effect. Simulation studies are conducted to see whether there is any gain in estimation efficiency from using the proposed method instead of complete-case analysis and multiple imputation. The results identified EM as the most efficient and accurate method for dealing with missingness in the confounder. The proposed method is applied to the B-aware trial, a multi-centre clinical trial, to estimate the effect of total intravenous anaesthetic on post-operative anxiety.
Kim (J. Korean Stat. Soc. 37 (2008) 81-87) introduced an incorrect stochastic representation (SR) for the truncated Student-t (Tt) random variable. By pointing out that the gamma mixture based on a truncated normal distribution actually cannot result in a true Tt distribution, in this paper, we first propose three correct SRs and then recalculate the corresponding moments of the Tt distribution. Unlike those derived from the invalid SR of Kim (J. Korean Stat. Soc. 37 (2008) 81-87), the correct moments of the Tt distribution play a crucial role in parameter estimation. Based on the third proposed SR and the correct expressions of the truncated moments, expectation-maximization (EM) algorithms are developed for calculating the maximum likelihood estimates of the parameters of the Tt distribution. Extensions to a Tt regression model and a t interval-censored regression model are provided as well. Simulated experiments are conducted to evaluate the performance of the proposed methods. Finally, two real data analyses corroborate the theoretical results.
In this work, we address the MIMO semi-blind channel estimation problem. We propose an eigenvalue decomposition based technique to significantly reduce the dimensionality of the EM based algorithm when the imposed prior on the data is Gaussian, greatly lowering the computational complexity. In addition, we apply the Minimum Power Distortionless Response (MPDR) decoupling principle to derive a tractable EM algorithm that uses the actual discrete prior of the data symbols. Our results show that the proposed MPDR based algorithm outperforms other EM based algorithms in both low and high SNR regions. The results also show that a faster version of the algorithm can be obtained by initializing it with the eigenvalue decomposition based Gaussian algorithm.
Imaging Mass Cytometry (IMC), a multiplexed imaging technology, has become a valuable tool in biomedical research due to its capability to measure over 100 markers theoretically. However, the presence of noise in IMC ...