We show the linear convergence of a simple first-order algorithm for the minimum-volume enclosing ellipsoid problem and its dual, the D-optimal design problem of statistics. Using similar techniques, we show the linear convergence of the Frank-Wolfe algorithm with away steps applied to the simplex, under conditions different from those of Guélat and Marcotte. Computational tests confirm the attractive features of this method.
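For concreteness, a minimal NumPy sketch of the classical first-order ascent for the D-optimal design dual (the Frank-Wolfe/Fedorov-Wynn iteration with Khachiyan's exact line search); the away (decrease) steps analyzed in the paper are omitted, and the lifting to dimension d+1 and the tolerance are standard choices assumed here.

```python
import numpy as np

def d_optimal_design_fw(P, tol=1e-7, max_iter=10000):
    """Frank-Wolfe (Fedorov-Wynn) ascent for the D-optimal design problem
    dual to the minimum-volume enclosing ellipsoid of the columns of P
    (shape d x n). Returns the design weights u on the n points."""
    d, n = P.shape
    Q = np.vstack([P, np.ones(n)])      # lift points to dimension d+1
    m = d + 1
    u = np.full(n, 1.0 / n)             # start from the uniform design
    for _ in range(max_iter):
        M = (Q * u) @ Q.T               # information matrix M(u)
        kappa = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(M), Q)
        j = np.argmax(kappa)            # most violated point
        if kappa[j] <= m * (1 + tol):   # optimality: max_i kappa_i = m
            break
        # Khachiyan's exact line search for the log-det objective
        alpha = (kappa[j] - m) / (m * (kappa[j] - 1.0))
        u = (1.0 - alpha) * u
        u[j] += alpha
    return u
```

At termination the weights concentrate on the points that touch the optimal ellipsoid, which is the support structure that away steps exploit in the linear-rate analysis.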
The Frank-Wolfe algorithm is a popular method for minimizing a smooth convex function f over a compact convex set C. Whereas many convergence results have been derived in terms of function values, almost nothing is known about the convergence behavior of the sequence of iterates (x_t)_{t∈ℕ}. Under the usual assumptions, we design several counterexamples to the convergence of (x_t)_{t∈ℕ}, where f is d-times continuously differentiable, d ≥ 2, and f(x_t) → min_C f. Our counterexamples cover the cases of open-loop, closed-loop, and line-search step-size strategies and work for any choice of the linear minimization oracle, thus demonstrating the fundamental pathologies in the convergence behavior of (x_t)_{t∈ℕ}.
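For reference, a minimal NumPy sketch of the plain method discussed above, with the open-loop rule gamma_t = 2/(t+2) and a crude line search; a closed-loop rule (e.g., the short step based on a curvature constant) would slot into the same branch. The simplex example and the fixed iteration count are assumptions of this sketch, not the paper's setup.

```python
import numpy as np

def frank_wolfe(grad_f, lmo, x0, steps=1000, step_rule="open_loop", f=None):
    """Plain Frank-Wolfe: each iteration calls the linear minimization
    oracle lmo(g) = argmin_{s in C} <g, s> and moves toward its output.
    step_rule selects one of the step-size strategies named above;
    f is only needed for the line-search branch."""
    x = x0.copy()
    for t in range(steps):
        g = grad_f(x)
        s = lmo(g)                      # vertex returned by the LMO
        d = s - x                       # Frank-Wolfe direction
        if step_rule == "open_loop":    # predetermined gamma_t = 2/(t+2)
            gamma = 2.0 / (t + 2.0)
        else:                           # crude line search over [0, 1]
            cands = np.linspace(0.0, 1.0, 101)
            gamma = min(cands, key=lambda a: f(x + a * d))
        x = x + gamma * d
    return x

# Example: minimize ||x - b||^2 over the probability simplex.
b = np.array([0.2, 0.9, -0.3])
lmo = lambda g: np.eye(len(g))[np.argmin(g)]   # simplex LMO: best vertex
x = frank_wolfe(lambda x: 2 * (x - b), lmo, np.ones(3) / 3)
```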
Online convex optimization (OCO) encapsulates supervised learning when training sets are large-scale or dynamic, and has grown essential as data has proliferated. OCO decomposes learning into a sequence of sub-problems, each of which must be solved with limited information. To ensure safe model adaptation or to avoid overfitting, constraints are often imposed, and these are typically handled with projections or Lagrangian relaxation. To avoid this added complexity, we propose to study Frank-Wolfe (FW), which updates along feasible directions obtained by linear minimization of the gradient over the feasible set, so its iterates remain feasible without projections. We specifically focus on its use in non-stationary settings, motivated by the fact that its iterates have structured sparsity that may be employed as a distribution-free change-point detector. We establish performance in terms of dynamic regret, which quantifies cost accumulation as compared with the optimal at each individual time slot. Specifically, for convex losses, we establish O(T^{1/2}) dynamic regret up to metrics of non-stationarity. We relax the algorithm's required information to only noisy gradient estimates, i.e., partial feedback. We also consider a mini-batched 'Meta-Frank-Wolfe' and characterize its dynamic regret. Experiments on a matrix completion problem and on background separation in video demonstrate favorable performance of the proposed scheme. Moreover, the structured sparsity of FW is experimentally observed to yield the sharpest tracker of change points among alternative approaches to non-stationary online convex optimization.
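A minimal sketch of a projection-free online Frank-Wolfe round under assumed notation: one LMO call per time slot, a decaying step size, and an l1-ball feasible set. The paper's Meta-Frank-Wolfe, mini-batching, and change-point tracking are not reproduced here.

```python
import numpy as np

def online_frank_wolfe(grad_fns, lmo, x0):
    """Online Frank-Wolfe sketch: one LMO call per round, using the
    gradient of that round's loss at the current iterate. grad_fns is
    a sequence of callables, one per time slot."""
    x = x0.copy()
    iterates = []
    for t, grad_f in enumerate(grad_fns, start=1):
        g = grad_f(x)                 # (possibly noisy) gradient feedback
        s = lmo(g)                    # feasible extreme point, no projection
        gamma = 1.0 / np.sqrt(t)      # assumed decaying step size
        x = (1.0 - gamma) * x + gamma * s
        iterates.append(x)
    return iterates

# Time-varying losses f_t(x) = ||x - c_t||^2 on the unit l1 ball,
# with a slowly drifting target c_t (a toy non-stationary stream).
targets = [0.5 * np.sin(0.1 * t) * np.ones(5) for t in range(200)]
lmo = lambda g: -np.sign(g[np.argmax(np.abs(g))]) * np.eye(len(g))[np.argmax(np.abs(g))]
xs = online_frank_wolfe([lambda x, c=c: 2 * (x - c) for c in targets],
                        lmo, np.zeros(5))
```

Because each update mixes in a single extreme point of the l1 ball, the iterates stay sparse, which is the structure the abstract proposes to exploit for change-point detection.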
We study a first-order method to find the minimum cross-sectional area ellipsoidal cylinder containing a finite set of points. This problem arises in optimal design in statistics when one is interested in only a subset of the parameters. We provide convex formulations of this problem and its dual, and analyze a method based on the Frank-Wolfe algorithm for their solution. Under suitable conditions on the behavior of the method, we establish global and local convergence properties. However, difficulties may arise when a certain submatrix loses rank, and we describe a technique for dealing with this situation. (C) 2011 Elsevier B.V. All rights reserved.
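A hedged sketch of one Frank-Wolfe-type ascent step for the subset (D_s-optimal) design objective log det M(u) - log det M22(u), which is the statistical dual described above when only the first k parameters are of interest. The step-size rule below is a heuristic analogue of the D-optimal one, not the paper's line search, and the rank guard merely flags the difficulty the abstract mentions.

```python
import numpy as np

def fw_step_subset_design(X, u, k):
    """One Frank-Wolfe ascent step for log det M(u) - log det M22(u).
    X has shape (n, m) with one design point per row; u is a design on
    the n points; the first k coordinates are the parameters of interest
    and M22 is the nuisance-parameter block of M(u)."""
    n, m = X.shape
    M = (X * u[:, None]).T @ X           # full information matrix
    M22 = M[k:, k:]                      # nuisance-parameter block
    if np.linalg.matrix_rank(M22) < m - k:
        raise np.linalg.LinAlgError("M22 lost rank; see the paper's remedy")
    Minv, M22inv = np.linalg.inv(M), np.linalg.inv(M22)
    # variance function: the gradient of the objective in each coordinate
    omega = (np.einsum('ij,jk,ik->i', X, Minv, X)
             - np.einsum('ij,jk,ik->i', X[:, k:], M22inv, X[:, k:]))
    j = np.argmax(omega)                 # Frank-Wolfe vertex e_j
    # heuristic safeguarded step (assumption, not the paper's rule);
    # at optimality max_i omega_i = k by the equivalence theorem
    alpha = (omega[j] - k) / (k * (omega[j] - 1.0)) if omega[j] > k else 0.0
    return (1.0 - alpha) * u + alpha * np.eye(n)[j]
```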
Similarity and metric learning provides a principled approach to constructing a task-specific similarity from weakly supervised data. However, these methods are subject to the curse of dimensionality: as the number of features grows large, poor generalization is to be expected and training becomes intractable due to high computational and memory costs. In this paper, we propose a similarity learning method that can efficiently deal with high-dimensional sparse data. This is achieved through a parameterization of similarity functions by convex combinations of sparse rank-one matrices, together with the use of a greedy approximate Frank-Wolfe algorithm which provides an efficient way to control the number of active features. We show that the convergence rate of the algorithm, as well as its time and memory complexity, are independent of the data dimension. We further provide a theoretical justification of our modeling choices through an analysis of the generalization error, which depends logarithmically on the sparsity of the solution rather than on the number of features. Our experiments on datasets with up to one million features demonstrate the ability of our approach to generalize well despite the high dimensionality, as well as its superiority over several competing methods. (C) 2018 Elsevier B.V. All rights reserved.
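A simplified sketch of the idea, assuming an atom set of the form lam*(e_i ± e_j)(e_i ± e_j)^T and a triplet hinge loss; the naive O(d^2) scan below is exactly what the paper's greedy approximate oracle is designed to avoid, so treat this as illustration only.

```python
import numpy as np

def fw_sparse_similarity(triplets, d, lam=1.0, iters=100):
    """Frank-Wolfe similarity learning sketch: S is a convex combination
    of sparse rank-one atoms lam*(e_i ± e_j)(e_i ± e_j)^T, so at most one
    new feature pair becomes active per iteration. 'triplets' holds
    (x, x_pos, x_neg) tuples; the hinge loss is an assumed surrogate."""
    S = np.zeros((d, d))
    for t in range(iters):
        # subgradient of the average hinge loss max(0, 1 - x^T S (x_pos - x_neg))
        G = np.zeros((d, d))
        for x, xp, xn in triplets:
            if 1.0 - x @ S @ (xp - xn) > 0.0:
                G -= np.outer(x, xp - xn) / len(triplets)
        # LMO over the atom set: score <G, atom> for each (i, j, sign)
        best, atom = np.inf, None
        for i in range(d):
            for j in range(i + 1, d):
                for sign in (+1.0, -1.0):
                    val = lam * (G[i, i] + G[j, j] + sign * (G[i, j] + G[j, i]))
                    if val < best:
                        v = np.eye(d)[i] + sign * np.eye(d)[j]
                        best, atom = val, lam * np.outer(v, v)
        gamma = 2.0 / (t + 2.0)
        S = (1.0 - gamma) * S + gamma * atom
    return S
```

After T iterations S has at most 2T active features regardless of d, which is the mechanism behind the dimension-independent complexity claims.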
In this paper, we propose and analyze an away-step Frank-Wolfe algorithm designed for solving multiobjective optimization problems over polytopes. We prove that each limit point of the sequence generated by the algorithm is a weak Pareto optimal solution. Furthermore, under additional conditions, we show linear convergence of the whole sequence to a Pareto optimal solution. Numerical examples illustrate the promising performance of the proposed algorithm on problems where the multiobjective Frank-Wolfe convergence rate is only sublinear.
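A sketch of the multiobjective Frank-Wolfe step on a polytope with an explicitly listed vertex set: the oracle picks the vertex minimizing the worst-case linearized decrease across all objectives. The away steps that drive the paper's linear rate are omitted, and the toy bi-objective instance is an assumption.

```python
import numpy as np

def multiobjective_fw(grads, vertices, x0, iters=500):
    """Multiobjective Frank-Wolfe sketch: the direction subproblem is
    argmin over vertices v of max_j <grad_j(x), v - x>; a nonnegative
    optimal value certifies weak Pareto criticality."""
    x = x0.copy()
    for t in range(iters):
        G = np.array([g(x) for g in grads])      # one gradient per objective
        scores = [np.max(G @ (v - x)) for v in vertices]
        s = vertices[int(np.argmin(scores))]
        if min(scores) >= -1e-10:                # no common descent direction
            break
        x = x + 2.0 / (t + 2.0) * (s - x)
    return x

# Two conflicting quadratics over the 3-simplex (vertices = standard basis).
verts = list(np.eye(3))
g1 = lambda x: 2 * (x - np.array([1.0, 0.0, 0.0]))
g2 = lambda x: 2 * (x - np.array([0.0, 1.0, 0.0]))
x = multiobjective_fw([g1, g2], verts, np.ones(3) / 3)
```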
We study the properties of the Frank-Wolfe algorithm for solving the m-EXACT-SPARSE reconstruction problem, in which a signal y must be expressed as a sparse linear combination of a predefined set of atoms, called a dictionary. We prove that when the signal is sparse enough with respect to the coherence of the dictionary, the iterative process implemented by the Frank-Wolfe algorithm only recruits atoms from the support of the signal, i.e., the smallest set of atoms from the dictionary that allows for a perfect reconstruction of y. We also prove that under this same condition, there exists an iteration beyond which the algorithm converges exponentially.
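A runnable sketch of the setting, assuming the standard least-squares formulation over an l1 ball: the vertices of the ball are ±tau*e_i, so each Frank-Wolfe iteration recruits at most one dictionary atom, which is the mechanism behind the support-recovery result above.

```python
import numpy as np

def fw_exact_sparse(D, y, tau, iters=500):
    """Frank-Wolfe for min ||D w - y||^2 subject to ||w||_1 <= tau."""
    n = D.shape[1]
    w = np.zeros(n)
    for t in range(iters):
        g = D.T @ (D @ w - y)                 # gradient of the LS loss
        i = int(np.argmax(np.abs(g)))         # most correlated atom
        s = np.zeros(n)
        s[i] = -tau * np.sign(g[i])           # l1-ball vertex from the LMO
        d = s - w
        Dd = D @ d
        # exact line search for the quadratic objective, clipped to [0, 1]
        gamma = np.clip(-(D @ w - y) @ Dd / (Dd @ Dd + 1e-12), 0.0, 1.0)
        w = w + gamma * d
    return w

# Example: normalized random dictionary and a 3-sparse ground truth.
rng = np.random.default_rng(1)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
w_true = np.zeros(256)
w_true[[3, 70, 141]] = [1.0, -0.5, 0.8]
w = fw_exact_sparse(D, D @ w_true, tau=np.abs(w_true).sum())
```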
This letter aims to enhance the use of the Frank-Wolfe (FW) algorithm for training deep neural networks (DNNs). Like any gradient-based optimization algorithm, FW suffers from high computational and memory costs when computing gradients for DNNs. This letter introduces the application of the recently proposed projected forward gradient (Projected-FG) method to the FW framework, offering a computational cost comparable to backpropagation and memory utilization as low as forward propagation. Our results show that a naive application of Projected-FG introduces a non-vanishing convergence error, due to the stochastic noise that the Projected-FG method injects into the process; this noise results in a non-vanishing variance in the Projected-FG gradient estimate. To address this, we propose a variance-reduction approach that aggregates historical Projected-FG directions. We rigorously demonstrate that this approach ensures convergence to the optimal solution for convex functions and to a stationary point for non-convex functions. Simulations corroborate these results.
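A toy sketch of the described scheme under assumptions: a forward difference stands in for the forward-mode JVP of Projected-FG, and a running average implements the historical aggregation; the weights rho_t = 1/t and the simplex example are illustrative choices, not the letter's exact parameters.

```python
import numpy as np

def fw_forward_gradient(f, lmo, x0, iters=2000, eps=1e-6):
    """Frank-Wolfe with a forward-gradient estimator: each round samples a
    random direction v, estimates (grad f)(x)·v by a forward difference,
    and averages the rank-one estimates jvp*v over time to shrink their
    variance before taking the FW step."""
    x = x0.copy()
    d_agg = np.zeros_like(x)
    rng = np.random.default_rng(0)
    for t in range(1, iters + 1):
        v = rng.standard_normal(x.size)
        jvp = (f(x + eps * v) - f(x)) / eps        # directional derivative
        g_hat = jvp * v                            # unbiased: E[v v^T] = I
        rho = 1.0 / t                              # assumed averaging weight
        d_agg = (1.0 - rho) * d_agg + rho * g_hat  # historical aggregation
        s = lmo(d_agg)
        x = x + (2.0 / (t + 2.0)) * (s - x)
    return x

# Minimize ||x - b||^2 over the probability simplex.
b = np.array([0.1, 0.7, 0.2, -0.1])
x = fw_forward_gradient(lambda z: np.sum((z - b) ** 2),
                        lambda g: np.eye(len(g))[np.argmin(g)],
                        np.ones(4) / 4)
```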
With increasingly "big" data available in biomedical research, deriving accurate and reproducible biological knowledge from such data poses enormous computational challenges. In this paper, motivated by recently developed stochastic block-coordinate algorithms, we propose a highly scalable randomized block-coordinate Frank-Wolfe algorithm for convex optimization with general compact convex constraints, which has diverse applications in analyzing biomedical data for a better understanding of cellular and disease mechanisms. We focus on implementing the derived stochastic block-coordinate algorithm to align protein-protein interaction networks for identifying conserved functional pathways based on the IsoRank framework. Our stochastic block-coordinate Frank-Wolfe (SBCFW) algorithm comes with a convergence guarantee and naturally reduces the computational cost (time and space) of each iteration. Our experiments on querying conserved functional protein complexes in yeast networks confirm the effectiveness of this technique for analyzing large-scale biological networks.
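A generic sketch of a randomized block-coordinate Frank-Wolfe iteration on a product of simplices, a constraint structure of the kind that arises in IsoRank-style alignment relaxations; the IsoRank objective itself is omitted and the quadratic toy objective is an assumption.

```python
import numpy as np

def block_coordinate_fw(grad_f, blocks, x0, iters=1000, seed=0):
    """Stochastic block-coordinate Frank-Wolfe sketch: each round samples
    one coordinate block, solves that block's LMO (here a simplex), and
    updates only that block, keeping per-iteration cost small."""
    x = x0.copy()
    rng = np.random.default_rng(seed)
    for t in range(iters):
        blk = blocks[rng.integers(len(blocks))]   # random coordinate block
        g = grad_f(x)[blk]                        # this block's gradient slice
        s_blk = np.eye(len(blk))[np.argmin(g)]    # simplex LMO on the block
        gamma = 2.0 / (t + 2.0)
        x[blk] = (1.0 - gamma) * x[blk] + gamma * s_blk
    return x

# Toy instance: two 3-dimensional simplex blocks, quadratic objective.
blocks = [np.arange(0, 3), np.arange(3, 6)]
target = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
x = block_coordinate_fw(lambda x: 2 * (x - target), blocks,
                        np.full(6, 1.0 / 3.0))
```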
This paper modifies the Frank-Wolfe algorithm. Under weaker conditions, it proves that the modified algorithm is convergent; in particular, under the assumption that the objective function is convex, convergence is established without assuming that the sequence {x̄_k} is bounded.