We study the intrinsic limitations of sequential convex optimization through the lens of feedback information theory. In the oracle model of optimization, an algorithm queries an oracle for noisy information about the unknown objective function, and the goal is to (approximately) minimize every function in a given class using as few queries as possible. We show that, in order for a function to be optimized, the algorithm must be able to accumulate enough information about the objective. This, in turn, puts limits on the speed of optimization under specific assumptions on the oracle and the type of feedback. Our techniques are akin to the ones used in the statistical literature to obtain minimax lower bounds on the risks of estimation procedures; the notable difference is that, unlike in the case of i.i.d. data, a sequential optimization algorithm can gather observations in a controlled manner, so that the amount of information at each step is allowed to change in time. In particular, we show that optimization algorithms often obey the law of diminishing returns: the signal-to-noise ratio drops as the optimization algorithm approaches the optimum. To underscore the generality of the tools, we use our approach to derive fundamental lower bounds for a certain active learning problem. Overall, the present work connects the intuitive notions of "information" in optimization, experimental design, estimation, and active learning to the quantitative notion of Shannon information.
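The information-accumulation argument above can be made concrete with a standard Fano-type inequality; the sketch below uses generic symbols of our own choosing (M alternatives, index V, feedback Y^T), not the paper's notation.

```latex
% Reduce optimization to testing: the function class contains M
% well-separated alternatives, indexed by V uniform on {1,...,M};
% Y^T = (Y_1,...,Y_T) is the noisy oracle feedback after T queries.
% Fano's inequality gives
\[
  \Pr[\hat{V} \neq V] \;\ge\; 1 - \frac{I(V;\,Y^T) + \log 2}{\log M},
  \qquad
  I(V;\,Y^T) \;=\; \sum_{t=1}^{T} I(V;\,Y_t \mid Y^{t-1}).
\]
% Optimization cannot succeed before the accumulated mutual information
% reaches order \log M. A per-query cap I(V; Y_t | Y^{t-1}) \le c (the
% signal-to-noise constraint, which shrinks near the optimum) then forces
% T to be at least of order (\log M)/c.
```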
We obtain a new lower bound on the information-based complexity of first-order minimization of smooth and convex functions. We show that the bound matches the worst-case performance of the recently introduced Optimized Gradient Method (Drori and Teboulle, 2013; Kim and Fessler, 2015), thereby establishing that the bound is tight and can be realized by an efficient algorithm. The proof is based on a novel technique for constructing smooth convex functions.
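For reference, the Optimized Gradient Method itself is short; the following is a minimal sketch of the OGM iteration as published by Kim and Fessler (the quadratic at the end is a hypothetical usage example, not from the paper).

```python
import numpy as np

def ogm(grad, x0, L, N):
    """Optimized Gradient Method (Kim & Fessler, 2016) -- a sketch.

    grad: callable returning the gradient of an L-smooth convex f
    x0:   starting point, L: smoothness constant, N: number of steps
    Guarantee: f(x_N) - f* <= L * ||x0 - x*||^2 / (2 * theta_N^2),
    which matches the lower bound discussed in the abstract above.
    """
    x, y = x0.copy(), x0.copy()
    theta = 1.0
    for i in range(N):
        y_next = x - grad(x) / L                          # plain gradient step
        if i < N - 1:
            theta_next = (1 + np.sqrt(1 + 4 * theta**2)) / 2
        else:                                             # larger final factor
            theta_next = (1 + np.sqrt(1 + 8 * theta**2)) / 2
        # momentum combining the last two gradient steps
        x = y_next + ((theta - 1) / theta_next) * (y_next - y) \
                   + (theta / theta_next) * (y_next - x)
        y, theta = y_next, theta_next
    return x

# Toy usage on f(x) = 0.5 x^T A x (hypothetical example):
A = np.diag([1.0, 10.0, 100.0])
x_N = ogm(lambda x: A @ x, np.ones(3), L=100.0, N=50)
```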
The information-based complexity of optimal reconstruction problems for general nonlinear systems is studied. Both lower and upper bounds for the worst-case reconstruction-error functional are given, and the existence of an optimal reconstruction algorithm that achieves the infimum of the reconstruction error is established.
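In the standard information-based complexity framing (our notation below, which may differ from the paper's: solution operator S, information operator N, algorithm φ over a class F), the quantities in question are:

```latex
% Worst-case error of an algorithm \varphi that sees only the information
% N(f) about the unknown system f:
\[
  e(\varphi, N) \;=\; \sup_{f \in F} \bigl\lVert S(f) - \varphi(N(f)) \bigr\rVert,
\]
% and the best achievable error equals the radius of information,
\[
  \inf_{\varphi} e(\varphi, N) \;=\; r(N)
  \;=\; \sup_{y \in N(F)} \operatorname{rad}\,\{\, S(f) : f \in F,\ N(f) = y \,\},
\]
% the radius of the smallest ball containing all solution values
% consistent with the observed data y.
```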
We explain why information-based complexity uses the real number model. Results in the real number model are essentially the same as in floating point arithmetic with fixed precision, modulo two important assumptions: (i) we use only stable algorithms, and (ii) the approximation error is not too small compared to the product of the condition number, the roundoff unit of floating point arithmetic, and the accumulation constant of a stable algorithm. We illustrate this with the example of solving nonlinear equations by bisection. We also indicate possible tradeoffs between complexity and stability, and the need to use multiple or varying precision for ill-conditioned problems.
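A minimal sketch of stability-aware bisection in the spirit of the abstract, with the tolerance floored at a few units of roundoff so the requested accuracy is never "too small" in the sense above (the floor of four ulps is our illustrative choice):

```python
import math

def bisect(f, a, b, tol):
    """Bisection for a root of f in [a, b] with f(a)*f(b) < 0 -- a stable
    algorithm in the sense of the abstract.

    In fixed-precision floating point the achievable accuracy is limited:
    below roughly eps * max(|a|, |b|) the sign of f is no longer reliable,
    so we never ask for a tolerance finer than a few units of roundoff.
    """
    eps = math.ulp(1.0)                              # roundoff unit
    tol = max(tol, 4 * eps * max(abs(a), abs(b)))    # don't fight the hardware
    fa = f(a)
    while (b - a) > tol:
        m = a + (b - a) / 2                          # safer than (a + b) / 2
        fm = f(m)
        if fm == 0.0:
            return m
        if (fa > 0) == (fm > 0):                     # same sign: root in [m, b]
            a, fa = m, fm
        else:                                        # sign change: root in [a, m]
            b = m
    return a + (b - a) / 2

root = bisect(lambda x: x**3 - 2.0, 1.0, 2.0, 1e-15)  # cube root of 2
```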
The exact order of complexity is found for weakly singular integral equations, including those with logarithmic singularities, with periodic and analytic coefficients. This class of equations contains the boundary equations of exterior boundary value problems for the two-dimensional Helmholtz equation.
Existing complexity results in stochastic linear programming using the Turing model depend only on problem dimensionality. We apply techniques from the information-based complexity literature to show that the smoothness of the recourse function is just as important. We derive approximation error bounds for the recourse function of two-stage stochastic linear programs and show that, in the worst case, they are exponential, depending on the solution tolerance, the dimensionality of the uncertain parameters, and the smoothness of the recourse function.
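The recourse function whose smoothness drives these bounds is easy to state concretely. Below is a toy sample-average sketch of E[Q(x, ξ)] for a hypothetical two-stage problem; all problem data (q, W, T, h(ξ)) are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

q = np.array([1.0, 1.5])                 # second-stage costs (illustrative)
W = np.array([[1.0, 0.0], [0.0, 1.0]])   # recourse matrix
T = np.array([[1.0], [2.0]])             # technology matrix

def recourse(x, xi):
    """Q(x, xi) = min_y { q^T y : W y >= h(xi) - T x, y >= 0 }."""
    h = np.array([xi, xi**2])            # RHS depends on the random parameter
    rhs = h - T @ np.atleast_1d(x)
    res = linprog(q, A_ub=-W, b_ub=-rhs) # flip signs: linprog uses A_ub y <= b_ub
    return res.fun if res.success else np.inf

# Sample-average approximation of the expected recourse E[Q(x, xi)]:
rng = np.random.default_rng(0)
xis = rng.uniform(0.0, 1.0, size=1000)
expected_Q = np.mean([recourse(0.3, xi) for xi in xis])
```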
In this paper, we give a novel insight into the study of low-level vision: we attack low-level problems with parallel information-based complexity techniques. Here we study only visual surface reconstruction. We obtain tight bounds on the complexity, considered as a function of two variables simultaneously: the number of processors and the required precision. This result appears to be new even in the serial case. Our results provide a benchmark for the intrinsic difficulty of visual surface reconstruction: one can compare this intrinsic complexity with the cost of any algorithm that solves the problem to tell how well the given algorithm measures up. To the best of our knowledge, this is the first attempt to introduce parallel information-based complexity techniques into the study of low-level vision.
Suppose x is an unknown vector in R^m (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n = O(m^{1/4} log^{5/2}(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor), so the coefficients belong to an ℓ_p ball for 0 < p <= 1. The N most important coefficients in that expansion allow reconstruction with ℓ_2 error O(N^{1/2-1/p}). It is possible to design n = O(N log(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients. Moreover, a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program (Basis Pursuit in signal processing). The nonadaptive measurements have the character of "random" linear combinations of basis/frame elements. Our results use the notions of optimal recovery, of n-widths, and of information-based complexity. We estimate the Gel'fand n-widths of ℓ_p balls in high-dimensional Euclidean space in the case 0 < p <= 1, and give a criterion identifying near-optimal subspaces for Gel'fand n-widths. We show that "most" subspaces are near-optimal, and show that convex optimization (Basis Pursuit) is a near-optimal way to extract information derived from these near-optimal subspaces.
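A minimal sketch of the recovery pipeline the abstract describes: random nonadaptive measurements followed by Basis Pursuit, here solved via the textbook LP split z = u - v with u, v >= 0 (the dimensions and sparsity level are illustrative choices):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, n, N = 200, 60, 5          # ambient dim m, measurements n, sparsity N
x = np.zeros(m)
x[rng.choice(m, N, replace=False)] = rng.standard_normal(N)  # N-sparse signal

A = rng.standard_normal((n, m)) / np.sqrt(n)  # "random" linear functionals
b = A @ x                                     # the n nonadaptive measurements

# Basis Pursuit:  min ||z||_1  s.t.  A z = b, as an LP in (u, v) with
# z = u - v; linprog's default bounds already enforce u, v >= 0.
c = np.ones(2 * m)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b)
z = res.x[:m] - res.x[m:]

print(np.linalg.norm(z - x))  # near-exact recovery from n << m samples
```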
We study the approximate solution of initial value problems for parameter-dependent finite or infinite systems of scalar ordinary differential equations (ODEs). Both the deterministic and the randomized settings are considered, with input data from various smoothness classes. We study deterministic and Monte Carlo multilevel algorithms and derive convergence rates. Moreover, we prove their optimality by showing matching lower bounds (in some limit cases up to logarithmic factors), thereby settling the complexity. Comparisons between the deterministic and randomized settings are given as well.
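A minimal multilevel Monte Carlo sketch in the spirit of the abstract, for a hypothetical scalar test ODE with a uniformly distributed parameter; the level schedule and sample counts are illustrative, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

def euler(omega, steps):
    """Euler's method for the hypothetical test ODE u' = -omega*u, u(0) = 1."""
    u, h = 1.0, 1.0 / steps
    for _ in range(steps):
        u += h * (-omega * u)
    return u                        # approximates u(1) = exp(-omega)

def mlmc(levels, samples_per_level):
    """Multilevel Monte Carlo for E_omega[u(1)], omega ~ U(0, 1).

    Telescoping: E[P_L] = E[P_0] + sum_l E[P_l - P_{l-1}], with level l
    using 2^l Euler steps; coarse/fine pairs share the same omega draw,
    so the correction terms have small variance and need few samples.
    """
    total = 0.0
    for l, n_l in zip(range(levels + 1), samples_per_level):
        omegas = rng.uniform(0.0, 1.0, n_l)
        fine = np.mean([euler(w, 2**l) for w in omegas])
        coarse = np.mean([euler(w, 2**(l - 1)) for w in omegas]) if l > 0 else 0.0
        total += fine - coarse
    return total

est = mlmc(levels=5, samples_per_level=[4096, 1024, 256, 64, 16, 8])
# Reference value: E[exp(-omega)] = 1 - exp(-1), roughly 0.6321
```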
We consider the oracle complexity of computing an approximate stationary point of a Lipschitz function. When the function is smooth, it is well known that the simple deterministic gradient method has finite, dimension-free oracle complexity. However, when the function can be nonsmooth, a randomized algorithm with finite dimension-free oracle complexity has been developed only recently. In this paper, we show that no deterministic algorithm can do the same. Moreover, even without the dimension-free requirement, we show that no finite-time deterministic method can be general zero-respecting. In particular, this implies that a natural derandomization of the aforementioned randomized algorithm cannot have finite-time complexity. Our results reveal a fundamental hurdle in modern large-scale nonconvex nonsmooth optimization.
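For contrast with the nonsmooth case, the smooth deterministic baseline the abstract mentions is the plain gradient method; below is a minimal sketch, with the standard dimension-free complexity bound noted in the docstring (the toy objective is our hypothetical example).

```python
import numpy as np

def gd_stationary(grad, x0, L, eps):
    """Gradient descent until ||grad f(x)|| <= eps, for L-smooth f.

    Each step decreases f by at least ||g||^2 / (2L), so at most
    2 * L * (f(x0) - f*) / eps^2 iterations are needed: a finite,
    dimension-free oracle complexity, in contrast with the deterministic
    nonsmooth setting ruled out in the abstract above.
    """
    x = x0.copy()
    while True:
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            return x               # eps-approximate stationary point
        x = x - g / L              # the classic 1/L step

# Toy usage on a smooth nonconvex function (hypothetical example):
grad = lambda x: -np.sin(x)        # gradient of f(x) = sum(cos(x)), L = 1
x_stat = gd_stationary(grad, np.array([0.5, 1.0]), L=1.0, eps=1e-6)
```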