ISBN (print): 9781479999880
We address the problem of reference-based compressed sensing: reconstruct a sparse signal from few linear measurements using as prior information a reference signal, i.e., a signal similar to the one we want to reconstruct. Access to reference signals arises in applications such as medical imaging, e.g., through prior images of the same patient, and compressive video, where previously reconstructed frames can be used as references. Our goal is to use the reference signal to reduce the number of measurements required for reconstruction. We achieve this via a reweighted ℓ1-ℓ1 minimization scheme that updates its weights based on a sample complexity bound. The scheme is simple, intuitive and, as our experiments show, outperforms prior algorithms, including reweighted ℓ1 minimization, ℓ1-ℓ1 minimization, and modified CS.
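The ℓ1-ℓ1 program is a linear program, so a small instance can be solved directly. The sketch below is a minimal illustration, assuming the common formulation min_x ||x||_1 + β||x - w||_1 subject to Ax = b, where w is the reference signal; the 1/(|x_i| + ε) reweighting rule used here is a standard heuristic stand-in, not the paper's sample-complexity-based weight update, and β = 1 is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import linprog

def l1_l1_min(A, b, w, beta=1.0, weights=None):
    """Solve  min_x  sum_i u_i*|x_i| + beta*|x_i - w_i|  s.t.  Ax = b
    as a linear program, using auxiliary variables t >= |x|, s >= |x - w|."""
    m, n = A.shape
    u = np.ones(n) if weights is None else weights
    # variable layout: z = [x, t, s]
    c = np.concatenate([np.zeros(n), u, beta * np.ones(n)])
    I, Z = np.eye(n), np.zeros((n, n))
    A_ub = np.block([
        [ I, -I,  Z],   #  x - t <= 0
        [-I, -I,  Z],   # -x - t <= 0
        [ I,  Z, -I],   #  x - s <= w
        [-I,  Z, -I],   # -x - s <= -w
    ])
    b_ub = np.concatenate([np.zeros(2 * n), w, -w])
    A_eq = np.hstack([A, np.zeros((m, 2 * n))])
    bounds = [(None, None)] * n + [(0, None)] * (2 * n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                  bounds=bounds, method="highs")
    return res.x[:n]

def reweighted_l1_l1(A, b, w, iters=3, eps=1e-3):
    """Heuristic reweighting: emphasize coordinates that were small so far."""
    x = l1_l1_min(A, b, w)
    for _ in range(iters - 1):
        x = l1_l1_min(A, b, w, weights=1.0 / (np.abs(x) + eps))
    return x

# toy reconstruction with an imperfect reference signal
rng = np.random.default_rng(0)
n, m = 20, 10
x_true = np.zeros(n)
x_true[[2, 7, 15]] = [1.0, -2.0, 1.5]
A = rng.standard_normal((m, n))
b = A @ x_true
w = x_true + 0.02 * rng.standard_normal(n)   # reference close to the target
x_hat = reweighted_l1_l1(A, b, w)
```

With only m = 10 measurements for an n = 20 signal, plain ℓ1 recovery is marginal, while the reference term pulls the solution toward the true support.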
ISBN (print): 9781665420556
There has recently been a surge of interest in understanding the horizon-dependence of the sample complexity in reinforcement learning (RL). Notably, for an RL environment with horizon length H, previous work has shown that there is a probably approximately correct (PAC) algorithm that learns an O(1)-optimal policy using polylog(H) episodes of environment interactions when the number of states and actions is fixed. It was not known whether the polylog(H) dependence is necessary. In this work, we resolve this question by developing an algorithm that achieves the same PAC guarantee while using only O(1) episodes of environment interactions, completely settling the horizon-dependence of the sample complexity in RL. We achieve this bound by (i) establishing a connection between value functions in discounted and finite-horizon Markov decision processes (MDPs) and (ii) a novel perturbation analysis in MDPs. We believe our new techniques are of independent interest and could be applied to related questions in RL.
ISBN (print): 9781450367059
This paper settles the sample complexity of single-parameter revenue maximization by showing matching upper and lower bounds, up to a poly-logarithmic factor, for all families of value distributions that have been considered in the literature. The upper bounds are unified under a novel framework, which builds on the strong revenue monotonicity of Devanur, Huang, and Psomas (STOC 2016) and an information-theoretic argument. This is fundamentally different from the previous approaches that rely on either constructing an ε-net of the mechanism space, explicitly or implicitly via statistical learning theory, or learning an approximately accurate version of the virtual values. To our knowledge, this is the first time information-theoretic arguments have been used to show sample complexity upper bounds, instead of lower bounds. Our lower bounds are also unified under a meta construction of hard instances.
ISBN (print): 9781450341325
Traditionally, the Bayesian optimal auction design problem has been considered either when the bidder values are i.i.d., or when each bidder is individually identifiable via her value distribution. The latter is a reasonable approach when the bidders can be classified into a few categories, but there are many instances where the classification of bidders is a continuum. For example, the classification of the bidders may be based on their annual income, their propensity to buy an item based on past behavior, or, in the case of ad auctions, the click-through rate of their ads. We introduce an alternate model that captures this aspect, where bidders are a priori identical but can be distinguished based (only) on some side information the auctioneer obtains at the time of the auction. We extend the sample complexity approach of Dhangwatnotai et al. and Cole and Roughgarden to this model and obtain almost matching upper and lower bounds. As an aside, we obtain a revenue monotonicity lemma which may be of independent interest. We also show how to use Empirical Risk Minimization techniques to improve the sample complexity bound of Cole and Roughgarden for the non-identical but independent value distribution case.
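A minimal single-bidder sketch of the sample-based approach these papers build on: given i.i.d. samples from the value distribution, pick the posted price that maximizes revenue on the empirical distribution. This is a generic empirical-revenue-maximization illustration, not the mechanism of this paper; it relies on the fact that an optimal empirical reserve is always attained at one of the sampled values.

```python
import numpy as np

def empirical_reserve(samples):
    """Reserve price maximizing empirical revenue  p * #{v_i >= p}.
    Scanning the sorted sample values suffices, since an optimal
    empirical reserve is attained at one of them."""
    v = np.sort(np.asarray(samples, dtype=float))
    n = len(v)
    revenues = v * (n - np.arange(n))   # price v[i] sells to n - i samples
    return v[np.argmax(revenues)]

# tiny example: prices 1, 2, 3, 4 give empirical revenues 4, 6, 6, 4
r_small = empirical_reserve([1.0, 2.0, 3.0, 4.0])

# Uniform[0, 1] values: the true revenue-optimal reserve is 1/2
rng = np.random.default_rng(0)
r_uniform = empirical_reserve(rng.uniform(0.0, 1.0, 10000))
```

As the sample size grows, the empirical reserve concentrates around the true revenue-optimal price; the sample complexity literature quantifies exactly how fast.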
Authors:
Zhou, Zhengyu; Liu, Weiwei
Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Inst Artificial Intelligence, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Wuhan, Peoples R China
This paper investigates the sample complexity of learning a distributionally robust predictor under a particular distributional shift based on the χ2-divergence, which is well known for its computational feasibility and statistical properties. We demonstrate that any hypothesis class H with finite VC dimension is distributionally robustly learnable. Moreover, we show that when the perturbation size is smaller than a constant, finite VC dimension is also necessary for distributionally robust learning, by deriving a sample complexity lower bound in terms of the VC dimension.
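To make the χ2-constrained objective concrete, here is a hedged sketch of the worst-case reweighted empirical loss: maximize the q-weighted average of per-sample losses over distributions q with χ2(q || uniform) ≤ ρ. When the maximizer stays nonnegative, a short Lagrangian/Cauchy-Schwarz argument gives the closed form mean + sqrt(ρ)·std; the clipping fallback below is an illustrative approximation, not part of the paper.

```python
import numpy as np

def chi2_robust_loss(losses, rho):
    """Worst-case reweighting of per-sample losses over all distributions q
    with chi^2(q || uniform) <= rho.  If the maximizing q is nonnegative,
    the value equals  mean(losses) + sqrt(rho) * std(losses)."""
    losses = np.asarray(losses, dtype=float)
    n = len(losses)
    centered = losses - losses.mean()
    # steepest feasible perturbation of the uniform weights (sums to zero,
    # saturates the chi^2 budget, aligned with the centered losses)
    d = np.sqrt(rho / n) * centered / (np.linalg.norm(centered) + 1e-12)
    q = np.full(n, 1.0 / n) + d
    if np.any(q < 0):                    # constraint binds: clip (approximate)
        q = np.clip(q, 0.0, None)
        q /= q.sum()
    return float(q @ losses)

losses = np.array([0.0, 1.0, 2.0, 3.0])
robust = chi2_robust_loss(losses, rho=0.1)   # exceeds the plain mean of 1.5
```

The robust loss upweights the hardest samples, which is exactly the reweighting a distributionally robust learner must guard against.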
Quantum computers hold unprecedented potential for machine learning applications. Here, we prove that physical quantum circuits are probably approximately correct learnable on a quantum computer via empirical risk minimization: to learn a parametric quantum circuit with at most n^c gates, each gate acting on a constant number of qubits, the sample complexity is bounded by O(n^(c+1)). In particular, we explicitly construct a family of variational quantum circuits with O(n^(c+1)) elementary gates arranged in a fixed pattern, which can represent all physical quantum circuits consisting of at most n^c elementary gates. Our results provide a valuable guide for quantum machine learning in both theory and practice.
Machine learning has emerged as a leading force in revolutionizing technology, education, and almost every aspect of our lives. Reinforcement learning is a sub-field of machine learning that deals with the effects of dynamic feedback and systems that interact with the environment. In these settings, classic statistical and algorithmic guarantees often do not hold because of non-i.i.d. data, dynamic feedback, and distribution shift. We develop a framework for single-trajectory learning of nonlinear dynamical systems using mixing arguments. Our main result studies the landscape of empirical risk minimization for learning nonlinear dynamical systems from a single trajectory, and provides a uniform gradient convergence guarantee, which is combined with a novel one-point convexity argument to facilitate the learning of nonlinear dynamical systems. Our proposed framework allows for non-convex loss landscapes, and our sample complexity and statistical error rates are optimal in terms of the trajectory length, the dimensions of the system, and the input/noise strength. Next, we study the problem of learning bilinear dynamical systems from a single trajectory of the system’s states and inputs. Our main contribution is the application of martingale small-ball arguments to derive learning guarantees for non-mixing bilinear dynamical systems. We further extend our analysis to time-varying dynamical systems by studying the problem of learning non-mixing Markov jump systems. Specifically, we learn the dynamics in each mode and the Markov transition matrix, underlying the evolution of the mode switches, from a single trajectory of the system’s states, inputs, and modes. Our sample complexity and statistical error rates are optimal in terms of the trajectory length, the dimensions of the system and the input/noise strength. Lastly, as a preliminary to the problem of finding the best LTI dynamical system that can minimize least-squares loss given a single trajectory of an unknown dynamical system, we study
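The single-trajectory ERM studied above concerns nonlinear and bilinear systems; as a minimal, hedged illustration, the linear special case below fits (A, B) in x_{t+1} = A x_t + B u_t + w_t by least squares from one state/input trajectory. The specific system matrices and noise level are arbitrary choices for the demo.

```python
import numpy as np

def fit_linear_dynamics(X, U):
    """Least-squares estimate of (A, B) in  x_{t+1} = A x_t + B u_t + w_t
    from one trajectory X = [x_0 .. x_T] and inputs U = [u_0 .. u_{T-1}]."""
    Z = np.hstack([X[:-1], U])          # regressors  [x_t, u_t]
    Y = X[1:]                           # targets      x_{t+1}
    Theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T     # A_hat, B_hat

# simulate a single trajectory of a stable 2-state system
rng = np.random.default_rng(2)
A = np.array([[0.5, 0.1], [0.0, 0.4]])
B = np.array([[1.0], [0.5]])
T = 500
U = rng.standard_normal((T, 1))
X = np.zeros((T + 1, 2))
for t in range(T):
    X[t + 1] = A @ X[t] + B @ U[t] + 0.01 * rng.standard_normal(2)

A_hat, B_hat = fit_linear_dynamics(X, U)
```

For stable (mixing) linear systems the estimation error shrinks with the trajectory length T, which is the quantity the sample complexity results above make precise for richer system classes.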
ISBN (print): 9781450336642
In semi-supervised multi-view learning, unlabeled sample complexity (u.s.c.) specifies the size of the unlabeled training sample that guarantees a desired learning error. In this paper, we improve the state-of-the-art u.s.c. from O(1/ε) to O(log(1/ε)) for small error ε, under mild conditions. To obtain the improved result, as a primary step we prove a connection between the generalization error of a classifier and its incompatibility, which measures the fitness between the classifier and the sample distribution. We then prove that with a sufficiently large unlabeled sample, one is able to find classifiers with low incompatibility. Combining the two observations, we prove a probably approximately correct (PAC) style learning bound for semi-supervised multi-view learning. We empirically verify our theory by designing two proof-of-concept algorithms, one based on active view sensing and the other based on online co-regularization, on real-world data sets.
ISBN (print): 9781479949755
In the synthesis model, signals are represented as sparse combinations of atoms from a dictionary. Dictionary learning describes the acquisition process of the underlying dictionary for a given set of training samples. While ideally this would be achieved by optimizing the expectation of the factors over the underlying distribution of the training data, in practice the necessary information about the distribution is not available. Therefore, in real-world applications it is achieved by minimizing an empirical average over the available samples. The main goal of this paper is to provide a sample complexity estimate that controls to what extent the empirical average deviates from the cost function. This estimate then provides a suitable estimate of the accuracy of the representation of the learned dictionary. The presented approach exemplifies the general results proposed by the authors in [1] and gives more concrete bounds on the sample complexity of dictionary learning. We cover a variety of sparsity measures employed in the learning procedure.
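A toy sketch of the empirical-average-vs-expectation gap the paper controls. For illustration we use a 1-sparse representation cost (an assumption chosen so the inner minimization has a closed form: for unit-norm atoms, min over one atom and coefficient of ||y - a·d_j||² equals ||y||² - max_j ⟨d_j, y⟩²); this is not one of the paper's sparsity measures, just the simplest stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
d, K = 8, 16
D = rng.standard_normal((d, K))
D /= np.linalg.norm(D, axis=0)            # unit-norm dictionary atoms

def sparse_rep_cost(Y, D):
    """Average 1-sparse representation error  mean_i min_{j,a} ||y_i - a d_j||^2,
    which for unit-norm atoms equals  ||y||^2 - max_j <d_j, y>^2."""
    corr = Y @ D                          # (N, K) atom correlations
    return np.mean((Y ** 2).sum(axis=1) - (corr ** 2).max(axis=1))

# empirical averages over N samples concentrate around the expected cost
reference = sparse_rep_cost(rng.standard_normal((200000, d)), D)  # proxy for E
small = sparse_rep_cost(rng.standard_normal((100, d)), D)
large = sparse_rep_cost(rng.standard_normal((20000, d)), D)
```

The deviation of the empirical average from the expectation shrinks with the sample size; the paper's sample complexity estimate quantifies this uniformly over dictionaries, which is what makes the learned dictionary's accuracy controllable.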