We give nearly matching upper and lower bounds on the oracle complexity of finding epsilon-stationary points (||del F(x)|| <= epsilon) in stochastic convex optimization. We jointly analyze the oracle complexity in ...
详细信息
We give nearly matching upper and lower bounds on the oracle complexity of finding epsilon-stationary points (||del F(x)|| <= epsilon) in stochastic convex optimization. We jointly analyze the oracle complexity in both the local stochastic oracle model and the global oracle (or, statistical learning) model. This allows us to decompose the complexity of finding near-stationary points into optimization complexity and sample complexity, and reveals some surprising differences between the complexity of stochastic optimization versus learning. Notably, we show that in the global oracle/statistical learning model, only logarithmic dependence on smoothness is required to find a near-stationary point, whereas polynomial dependence on smoothness is necessary in the local stochastic oracle model. In other words, the separation in complexity between the two models can be exponential, and the folklore understanding that smoothness is required to find stationary points is only weakly true for statistical learning. Our upper bounds are based on extensions of a recent "recursive regularization" technique proposed by Allen-Zhu (2018). We show how to extend the technique to achieve near-optimal rates, and in particular show how to leverage the extra information available in the global oracle model. Our algorithm for the global model can be implemented efficiently through finite sum methods, and suggests an interesting new computational-statistical tradeoff.
Linear programming is a foundational tool for many aspects of integer and combinatorial optimization. This work studies the complexity of solving linear programs exactly over the rational numbers through use of an ora...
详细信息
ISBN:
(纸本)9783030179533;9783030179526
Linear programming is a foundational tool for many aspects of integer and combinatorial optimization. This work studies the complexity of solving linear programs exactly over the rational numbers through use of an oracle capable of returning limited-precision LP solutions. Under mild assumptions, it is shown that a polynomial number of calls to such an oracle and a polynomial number of bit operations, is sufficient to compute an exact solution to an LP. Previous work has often considered oracles that provide solutions of an arbitrary specified precision. While this leads to polynomial-time algorithms, the level of precision required is often unrealistic for practical computation. In contrast, our work provides a foundation for understanding and analyzing the behavior of the methods that are currently most effective in practice for solving LPs exactly.
The (optimal) function/gradient evaluations worst-case complexity analysis available for the adaptive regularization algorithms with cubics (ARC) for nonconvex smooth unconstrained optimization is extended to finite-d...
详细信息
The (optimal) function/gradient evaluations worst-case complexity analysis available for the adaptive regularization algorithms with cubics (ARC) for nonconvex smooth unconstrained optimization is extended to finite-difference versions of this algorithm, yielding complexity bounds for first-order and derivative-free methods applied on the same problem class. A comparison with the results obtained for derivative-free methods by Vicente [Worst Case complexity of Direct Search, Technical report, Preprint 10-17, Department of Mathematics, University of Coimbra, Coimbra, Portugal, 2010] is also discussed, giving some theoretical insight into the relative merits of various methods in this popular class of algorithms.
We propose a near-optimal method for highly smooth convex optimization. More precisely, in the oracle model where one obtains the pth order Taylor expansion of a function at the query point, we propose a method with r...
详细信息
We propose a near-optimal method for highly smooth convex optimization. More precisely, in the oracle model where one obtains the pth order Taylor expansion of a function at the query point, we propose a method with rate of convergence (O) over tilde (1/k(3p+1/2)) after k queries to the oracle for any convex function whose pth order derivative is Lipschitz.
Relative to the large literature on upper bounds on complexity of convex optimization, lesser attention has been paid to the fundamental hardness of these problems. Given the extensive use of convex optimization in ma...
详细信息
Relative to the large literature on upper bounds on complexity of convex optimization, lesser attention has been paid to the fundamental hardness of these problems. Given the extensive use of convex optimization in machine learning and statistics, gaining an understanding of these complexity-theoretic issues is important. In this paper, we study the complexity of stochastic convex optimization in an oracle model of computation. We introduce a new notion of discrepancy between functions, and use it to reduce problems of stochastic convex optimization to statistical parameter estimation, which can be lower bounded using information-theoretic methods. Using this approach, we improve upon known results and obtain tight minimax complexity estimates for various function classes.
Recent large-scale deployments of differentially private algorithms employ the local model for privacy (sometimes called PRAM or randomized response), where data are randomized on each individual's device before b...
详细信息
ISBN:
(纸本)9781509055333
Recent large-scale deployments of differentially private algorithms employ the local model for privacy (sometimes called PRAM or randomized response), where data are randomized on each individual's device before being sent to a server that computes approximate, aggregate statistics. The server need not be trusted for privacy, leaving data control in users' hands. For an important class of convex optimization problems (including logistic regression, support vector machines, and the Euclidean median), the best known locally differentially-private algorithms are highly interactive, requiring as many rounds of back and forth as there are users in the protocol. We ask: how much interaction is necessary to optimize convex functions in the local DP model? Existing lower bounds either do not apply to convex optimization, or say nothing about interaction. We provide new algorithms which are either noninteractive or use relatively few rounds of interaction. We also show lower bounds on the accuracy of an important class of noninteractive algorithms, suggesting a separation between what is possible with and without interaction.
The problem of factoring integers in polynomial time with the help of an infinitely powerful oracle who answers arbitrary questions with yes or no is considered. The goal is to minimize the number of oracle questions....
详细信息
The problem of factoring integers in polynomial time with the help of an infinitely powerful oracle who answers arbitrary questions with yes or no is considered. The goal is to minimize the number of oracle questions. Let N be a given composite n-bit integer to be factored, where n = [log(2) N]. The trivial method of asking for the bits of the smallest prime factor of N requires n/2 questions in the worst case. A non-trivial algorithm of Rivest and Shamir requires only n/3 questions for the special case where N is the product of two n/2-bit primes. In this paper, a polynomial-time oracle factoring algorithm for general integers is presented which, for any epsilon > 0, asks at most En oracle questions for sufficiently large N, thus solving an open problem posed by Rivest and Shamir. Based on a plausible conjecture related to Lenstra's conjecture on the running time of the elliptic curve factoring algorithm, it is shown that the algorithm fails with probability at most N-(epsilon/2) for all sufficiently large N.
暂无评论