Search Results - Inner Mongolia University Library

Learning Mixtures of Experts with EM

arXiv, 2024

Authors: Fruytier, Quentin; Mokhtari, Aryan; Sanghavi, Sujay. Affiliations: Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, United States

Mixtures of Experts (MoE) are Machine Learning models that involve partitioning the input space, with a separate "expert" model trained on each partition. Recently, MoE have become popular as components in today's large language models as a means to reduce training and inference costs. There, the partitioning function and the experts are both learnt jointly via gradient descent on the log-likelihood. In this paper we focus on studying the efficiency of the expectation maximization (EM) algorithm for the training of MoE models. We first rigorously analyze EM for the cases of linear or logistic experts, where we show that EM is equivalent to Mirror Descent with unit step size and a Kullback-Leibler Divergence regularizer. This perspective allows us to derive new convergence results and identify conditions for local linear convergence based on the signal-to-noise ratio (SNR). Experiments on synthetic and (small-scale) real-world data show that EM outperforms the gradient descent algorithm both in terms of convergence rate and the achieved accuracy. Copyright © 2024, The Authors. All rights reserved.

Keywords: expectation maximization algorithm
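
For orientation, the EM iteration that the abstract above analyzes can be written out in generic form. This is the textbook formulation for a mixture of experts, not the paper's own notation: the gating parameters $w$, expert parameters $\beta_k$, responsibilities $r_{ik}$, and sample index $i$ are assumed names, and the mirror-descent equivalence claimed in the abstract is not derived here.

```latex
% Mixture-of-experts likelihood with softmax gating (assumed notation)
p(y \mid x; \theta) = \sum_{k=1}^{K} \pi_k(x; w)\, p_k(y \mid x; \beta_k),
\qquad
\pi_k(x; w) = \frac{\exp(w_k^\top x)}{\sum_{j=1}^{K} \exp(w_j^\top x)}

% E-step: posterior responsibility of expert k for sample i at iterate t
r_{ik}^{(t)} =
  \frac{\pi_k(x_i; w^{(t)})\, p_k(y_i \mid x_i; \beta_k^{(t)})}
       {\sum_{j=1}^{K} \pi_j(x_i; w^{(t)})\, p_j(y_i \mid x_i; \beta_j^{(t)})}

% M-step: maximize the expected complete-data log-likelihood
\theta^{(t+1)} = \arg\max_{\theta}
  \sum_{i=1}^{n} \sum_{k=1}^{K} r_{ik}^{(t)}
  \left[ \log \pi_k(x_i; w) + \log p_k(y_i \mid x_i; \beta_k) \right]
```

With linear experts, $p_k$ is a Gaussian centered at $\beta_k^\top x$; with logistic experts it is a Bernoulli with a sigmoid mean.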

EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models

arXiv, 2024

Authors: Sun, Shangquan; Ren, Wenqi; Liu, Zikun; Park, Hyunhee; Wang, Rui; Cao, Xiaochun. Affiliations: Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, China; Guangdong Provincial Key Laboratory of Information Security Technology, China; China Camera Innovation Group, Samsung Electronics, Republic of Korea

Image restoration has experienced significant advancements due to the development of deep learning. Nevertheless, it encounters challenges related to ill-posed problems, resulting in deviations between single model predictions and ground-truths. Ensemble learning, as a powerful machine learning technique, aims to address these deviations by combining the predictions of multiple base models. Most existing works adopt ensemble learning during the design of restoration models, while only limited research focuses on the inference-stage ensemble of pre-trained restoration models. Regression-based methods fail to enable efficient inference, leading researchers in academia and industry to prefer averaging as their choice for post-training ensemble. To address this, we reformulate the ensemble problem of image restoration into Gaussian mixture models (GMMs) and employ an expectation maximization (EM)-based algorithm to estimate ensemble weights for aggregating prediction candidates. We estimate the range-wise ensemble weights on a reference set and store them in a lookup table (LUT) for efficient ensemble inference on the test set. Our algorithm is model-agnostic and training-free, allowing seamless integration and enhancement of various pre-trained image restoration models. It consistently outperforms regression-based methods and averaging ensemble approaches on 14 benchmarks across 3 image restoration tasks, including super-resolution, deblurring and deraining. The code and all estimated weights have been released on GitHub. © 2024, CC BY.

Keywords: expectation maximization algorithm
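
The idea of estimating ensemble weights with EM can be illustrated with a deliberately simplified sketch. It is not the EnsIR algorithm: it fits a single global weight per model rather than range-wise weights, omits the lookup table, and works on scalar values. The function names `em_ensemble_weights` and `ensemble`, the per-model variance parameters, and the toy data layout are all assumptions made for illustration.

```python
import math


def gaussian_pdf(y, mean, var):
    """Density of the normal distribution N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_ensemble_weights(preds, targets, n_iter=50, var_floor=1e-6):
    """Fit mixture weights for y_i ~ sum_m w_m * N(preds[i][m], var_m) with EM.

    preds[i][m] is the (hypothetical) prediction of model m for reference
    sample i, and targets[i] is the matching ground-truth value.
    """
    n_models = len(preds[0])
    weights = [1.0 / n_models] * n_models
    variances = [1.0] * n_models
    for _ in range(n_iter):
        # E-step: responsibility of each model for each reference sample.
        resp = []
        for row, y in zip(preds, targets):
            likes = [w * gaussian_pdf(y, x, v)
                     for w, x, v in zip(weights, row, variances)]
            total = sum(likes)
            if total > 0.0:
                resp.append([like / total for like in likes])
            else:
                # Numerical underflow: fall back to uniform responsibilities.
                resp.append([1.0 / n_models] * n_models)
        # M-step: update the weight and error variance of each model.
        for m in range(n_models):
            r_sum = sum(r[m] for r in resp)
            weights[m] = r_sum / len(targets)
            if r_sum > 0.0:
                sq = sum(r[m] * (y - row[m]) ** 2
                         for r, row, y in zip(resp, preds, targets))
                variances[m] = max(var_floor, sq / r_sum)
    return weights, variances


def ensemble(row, weights):
    """Weighted combination of the candidate predictions for one sample."""
    return sum(w * x for w, x in zip(weights, row))


# Toy reference set: model 0 is accurate, model 1 is noisy.
targets = [0.5 * i for i in range(20)]
preds = [[y + 0.05 * (-1) ** i, y + 1.0 * (-1) ** i]
         for i, y in enumerate(targets)]
weights, variances = em_ensemble_weights(preds, targets)
```

On this toy data the accurate model receives almost all of the weight, so the ensemble output stays close to its prediction, whereas plain averaging would give both models a weight of 0.5.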

Taming the Interacting Particle Langevin Algorithm: The Superlinear Case

arXiv, 2024

Authors: Johnston, Tim; Makras, Nikolaos; Sabanis, Sotirios. Affiliations: School of Mathematics, University of Edinburgh, United Kingdom; The Alan Turing Institute, United Kingdom; National Technical University of Athens, Greece

Recent advances in stochastic optimization have yielded the interacting particle Langevin algorithm (IPLA), which leverages the notion of interacting particle systems (IPS) to efficiently sample from approximate posterior densities. This becomes particularly crucial in relation to the framework of expectation-maximization (EM), where the E-step is computationally challenging or even intractable. Although prior research has focused on scenarios involving convex cases with gradients of log densities that grow at most linearly, our work extends this framework to include polynomial growth. Taming techniques are employed to produce an explicit discretization scheme that yields a new class of stable, under such non-linearities, algorithms which are called tamed interacting particle Langevin algorithms (tIPLA). We obtain non-asymptotic convergence error estimates in Wasserstein-2 distance for the new class under an optimal *** Codes 60H35 (Primary), 62D05 (Secondary) © 2024, CC BY.

Keywords: expectation maximization algorithm

Improved Evolutionary Algorithms for Submodular Maximization with Cost Constraints

arXiv, 2024

Authors: Zhu, Yanhui; Basu, Samik; Pavan, A. Affiliations: Department of Computer Science, Iowa State University, United States

We present an evolutionary algorithm, evo-SMC, for the problem of Submodular Maximization under Cost constraints (SMC). Our algorithm achieves a 1/2-approximation with high probability 1 - 1/n within O(n^2 K_β) iterations, where K_β denotes the maximum size of a feasible solution set with cost constraint β. To the best of our knowledge, this is the best approximation guarantee offered by evolutionary algorithms for this problem. We further refine evo-SMC and develop st-evo-SMC. This stochastic version yields a significantly faster algorithm while maintaining the approximation ratio of 1/2, with probability 1 - ϵ. The required number of iterations reduces to O(n K_β log(1/ϵ)/p), where the user-defined parameter p ∈ (0, 1] represents the stochasticity probability and ϵ ∈ (0, 1] denotes the error threshold. Finally, extensive empirical evaluations substantiate the efficiency and effectiveness of our proposed algorithms. Our algorithms consistently outperform existing methods, producing higher-quality solutions. © 2024, CC BY.

Keywords: expectation maximization algorithm
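
To make the problem setting concrete, the following sketch runs a generic mutation-based evolutionary search on a small coverage instance with a cost budget. It is not evo-SMC or st-evo-SMC and carries none of their approximation guarantees. The objective `coverage`, the function `evolutionary_search`, the bit-flip mutation rate 1/n, and the toy instance are all assumptions chosen for illustration.

```python
import random


def coverage(selected, sets):
    """Submodular objective: number of distinct elements covered."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)


def total_cost(selected, costs):
    """Sum of the costs of the selected items."""
    return sum(costs[i] for i in selected)


def evolutionary_search(sets, costs, budget, n_iter=2000, seed=0):
    """Keep one feasible solution and try to improve it by random bit flips."""
    rng = random.Random(seed)
    n = len(sets)
    best = frozenset()  # the empty set is always feasible
    for _ in range(n_iter):
        # Mutation: flip the membership of each item with probability 1/n.
        child = set(best)
        for i in range(n):
            if rng.random() < 1.0 / n:
                child.symmetric_difference_update({i})
        # Selection: accept only feasible children that are at least as good.
        if (total_cost(child, costs) <= budget
                and coverage(child, sets) >= coverage(best, sets)):
            best = frozenset(child)
    return best


# Toy instance: item 3 covers everything but exceeds the budget.
sets = [{1, 2, 3}, {3, 4}, {5}, {1, 2, 3, 4, 5, 6}]
costs = [2, 1, 1, 5]
budget = 4
solution = evolutionary_search(sets, costs, budget)
```

The returned solution always respects the budget because infeasible children are never accepted. How close it gets to the optimum depends on the iteration count and the random seed.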

Fast and robust cross-validation-based scoring rule inference for spatial statistics

arXiv, 2024

Authors: Ólafsdóttir, Helga Kristín; Rootzén, Holger; Bolin, David. Affiliations: Department of Mathematical Sciences, Chalmers University of Technology, University of Gothenburg, Gothenburg, Sweden; Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

Scoring rules are aimed at evaluating the quality of predictions, but can also be used to estimate parameters in statistical models. We propose estimating parameters of multivariate spatial models by maximising the average leave-one-out cross-validation score. This method, LOOS, thus optimises predictions instead of maximising the likelihood. The method allows for fast computations for Gaussian models with sparse precision matrices, such as spatial Markov models. It also makes it possible to tailor the estimator’s robustness to outliers and its sensitivity to spatial variations of uncertainty through the choice of the scoring rule used in the maximisation. The effects of the choice of scoring rule in LOOS are studied by simulation in terms of computation time, statistical efficiency, and robustness. Various popular scoring rules and a new scoring rule, the root score, are compared to maximum likelihood estimation. The results confirmed that for spatial Markov models the computation time for LOOS was much smaller than for maximum likelihood estimation. Furthermore, the standard deviations of parameter estimates were smaller for maximum likelihood estimation, although the differences were often small. The simulations also confirmed that using a robust scoring rule results in robust LOOS estimates and that the robustness provides better predictive quality for spatial data with outliers. Finally, the new inference method was applied to ERA5 temperature reanalysis data for the contiguous United States and the average July temperature for the years 1940 to 2023. This showed that the LOOS parameter estimates were more than a hundred times faster to compute than maximum-likelihood estimates and resulted in a model with better predictive performance. © 2024, CC BY.

Keywords: expectation maximization algorithm
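
Leave-one-out prediction is cheap for Gaussian models with a known precision matrix because of a standard identity: if y ~ N(mu, Q^{-1}), then y_i given all other entries is normal with variance 1/Q_ii and mean mu_i - (1/Q_ii) * sum_{j != i} Q_ij (y_j - mu_j). The sketch below applies that identity and averages the logarithmic score as one example of a scoring rule. The function names and the dense list-of-lists matrix are assumptions for illustration. It is not the paper's implementation, which targets sparse matrices and other scoring rules such as the root score.

```python
import math


def loo_predictive(y, mu, Q):
    """Leave-one-out predictive (mean, variance) for each entry of y.

    Assumes y ~ N(mu, Q^{-1}) with Q a symmetric positive definite
    precision matrix given as a list of rows.
    """
    n = len(y)
    out = []
    for i in range(n):
        var_i = 1.0 / Q[i][i]
        s = sum(Q[i][j] * (y[j] - mu[j]) for j in range(n) if j != i)
        out.append((mu[i] - var_i * s, var_i))
    return out


def mean_loo_log_score(y, mu, Q):
    """Average leave-one-out log predictive density (higher is better)."""
    total = 0.0
    for y_i, (m, v) in zip(y, loo_predictive(y, mu, Q)):
        total += -0.5 * math.log(2.0 * math.pi * v) - 0.5 * (y_i - m) ** 2 / v
    return total / len(y)


# Bivariate example with unit variances and correlation rho.
rho = 0.5
d = 1.0 - rho ** 2
Q = [[1.0 / d, -rho / d], [-rho / d, 1.0 / d]]
y = [1.0, 2.0]
mu = [0.0, 0.0]
preds = loo_predictive(y, mu, Q)
score = mean_loo_log_score(y, mu, Q)
```

In the bivariate example the result matches the familiar conditional distribution: the prediction for the first entry has mean rho * y[1] and variance 1 - rho^2. A LOOS-style estimator would maximise such an average score over the model parameters that determine mu and Q.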

Outlier-Insensitive Kalman Filtering: Theory and Applications

arXiv, 2023

Authors: Truzman, Shunit; Revach, Guy; Shlezinger, Nir; Klein, Itzik. Affiliations: The Hatter Dept. of Marine Technologies, University of Haifa, Haifa, Israel; D-ITET, ETH Zürich, Switzerland; The School of ECE, Ben-Gurion University of the Negev, Be’er Sheva, Israel

State estimation of dynamical systems from noisy observations is a fundamental task in many applications. It is commonly addressed using the linear Kalman filter (KF), whose performance can significantly degrade in the presence of outliers in the observations, due to the sensitivity of its convex quadratic objective function. To mitigate such behavior, outlier detection algorithms can be applied. In this work, we propose a parameter-free algorithm which mitigates the harmful effect of outliers while requiring only a short iterative process of the standard KF’s update step. To that end, we model each potential outlier as a normal process with unknown variance and apply online estimation through either expectation maximization or alternating maximization algorithms. Simulations and field experiment evaluations demonstrate our method’s competitive performance, showcasing its robustness to outliers in filtering scenarios compared to alternative algorithms. Copyright © 2023, The Authors. All rights reserved.

Keywords: expectation maximization algorithm
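
The effect of modelling an outlier as an extra zero-mean normal term can be seen in a scalar Kalman update. In the sketch below the outlier variance is passed in as a known number, whereas the paper estimates it online with expectation maximization or alternating maximization. That estimation step is not reproduced here, and the function name `kalman_update` and the parameter `outlier_var` are assumptions for illustration.

```python
def kalman_update(x_prior, p_prior, y, r, outlier_var=0.0):
    """Scalar Kalman filter update for the observation y = x + noise.

    r is the nominal measurement noise variance. outlier_var is the
    variance of an additional zero-mean normal outlier term (assumed
    known here), which inflates the innovation variance.
    """
    s = p_prior + r + outlier_var      # innovation variance
    k = p_prior / s                    # Kalman gain
    x_post = x_prior + k * (y - x_prior)
    p_post = (1.0 - k) * p_prior
    return x_post, p_post


# Prior belief x ~ N(0, 1) and a suspicious observation y = 10.
plain = kalman_update(0.0, 1.0, 10.0, 1.0)
robust = kalman_update(0.0, 1.0, 10.0, 1.0, outlier_var=98.0)
```

Without the outlier term the estimate jumps halfway to the observation (5.0). With a large outlier variance the gain shrinks to 0.01 and the estimate barely moves (0.1), which is the down-weighting behaviour an outlier-insensitive filter aims for.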

Discriminative Entropy Clustering and its Relation to K-means and SVM

arXiv, 2023

Authors: Zhang, Zhongwen Rex; Boykov, Yuri. Affiliations: University of Waterloo, Canada

Maximization of mutual information between the model’s input and output is formally related to "decisiveness" and "fairness" of the softmax predictions [1], motivating these unsupervised entropy-based criteria for clustering. First, in the context of linear softmax models, we discuss some general properties of entropy-based clustering. Disproving some earlier claims, we point out fundamental differences with K-means. On the other hand, we prove the margin-maximizing property for decisiveness, establishing a relation to SVM-based clustering. Second, we propose a new self-labeling formulation of entropy clustering for general softmax models. The pseudo-labels are introduced as auxiliary variables "splitting" the fairness and decisiveness. The derived self-labeling loss includes the reverse cross-entropy robust to pseudo-label errors and allows an efficient EM solver for pseudo-labels. Our algorithm improves the state of the art on several standard benchmarks for deep clustering. Copyright © 2023, The Authors. All rights reserved.

Keywords: expectation maximization algorithm

Nowcasting in triple-system estimation

arXiv, 2024

Authors: Zult, Daan B.; van der Heijden, Peter G.M.; Bakker, Bart F.M. Affiliations: Statistics Netherlands, Netherlands; Utrecht University, University of Southampton, Netherlands; Statistics Netherlands and VU University Amsterdam, Netherlands

Multiple systems estimation uses samples that each cover part of a population to obtain a total population size estimate. Ideally, all the available samples are used, but if some samples are available (much) later, one may use only the samples that are available early. Under some regularity conditions, including sample independence, two samples are enough to obtain an asymptotically unbiased population size estimate. However, the assumption of sample independence may be unrealistic, especially when samples are derived from administrative sources. The sample independence assumption can be relaxed when three or more samples are used, which is therefore generally recommended. This may be a problem if the third sample is available much later than the first two samples. Therefore, in this paper we propose a new approach that deals with this issue by utilising older samples, using the so-called expectation maximisation algorithm. This leads to a population size nowcast estimate that is asymptotically unbiased under more relaxed assumptions than the estimate based on two samples. The resulting nowcasting model is applied to the problem of estimating the number of homeless people in the Netherlands, which leads to reasonably accurate nowcast estimates. © 2024, CC BY.

Keywords: expectation maximization algorithm
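
The two-sample case mentioned in the abstract is commonly handled with the classical Lincoln-Petersen estimator, which assumes the two samples are independent. The abstract does not name this estimator, so treat the sketch below as background rather than the authors' method. The function name and the counts are hypothetical, and the paper's three-sample EM nowcast is not shown.

```python
def lincoln_petersen(n1, n2, m):
    """Two-sample population size estimate under sample independence.

    n1 and n2 are the sizes of the two samples and m is the number of
    individuals that appear in both. The estimate is n1 * n2 / m.
    """
    if m <= 0:
        raise ValueError("need at least one individual in both samples")
    return n1 * n2 / m


# Hypothetical registers: 200 and 150 people, 50 of them in both.
n1, n2, m = 200, 150, 50
observed = n1 + n2 - m            # distinct people seen at least once
estimate = lincoln_petersen(n1, n2, m)
unseen = estimate - observed      # estimated number missed by both samples
```

Here 300 distinct people are observed and the estimate is 600, so about 300 people are estimated to be missing from both samples. If the samples are positively dependent, the overlap m is inflated and the estimate is biased downward, which is the motivation for bringing in a third sample.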

A Dirichlet stochastic block model for composition-weighted networks

arXiv, 2024

Authors: Promskaia, Iuliia; O'Hagan, Adrian; Fop, Michael. Affiliations: SFI Insight Centre for Data Analytics, Dublin, Ireland; School of Mathematics and Statistics, University College Dublin, Dublin, Ireland

Network data are observed in various applications where the individual entities of the system interact with or are connected to each other, and very often these interactions are defined by their associated strength or importance. Clustering is a common task in network analysis that involves finding groups of nodes which display similarities in the way they interact with the rest of the network. However, most clustering methods use the strengths of connections between entities in their original form, ignoring the possible differences in the capacities of individual nodes to send or receive edges. This often leads to clustering solutions that are heavily influenced by the nodes' capacities. One way to overcome this is to analyse the strengths of connections in relative rather than absolute terms, expressing each edge weight as a proportion of the sending (or receiving) capacity of the respective node. This, however, induces additional modelling constraints that most existing clustering methods are not designed to handle. In this work we propose a stochastic block model for composition-weighted networks based on direct modelling of compositional weight vectors using a Dirichlet mixture, with the parameters determined by the cluster labels of the sender and the receiver nodes. Inference is implemented via an extension of the classification expectation-maximisation algorithm that uses a working independence assumption, expressing the complete data likelihood of each node of the network as a function of fixed cluster labels of the remaining nodes. A model selection criterion is derived to aid the choice of the number of clusters. An alternative approach to clustering in composition-weighted networks based on a mapping to the Euclidean space is also provided. The model is validated using a number of simulation studies, assessing the effect of various initialisation strategies on the model's performance, latent structure recovery, parameter estimation quality and model sele

Keywords: expectation maximization algorithm