ISBN (print): 1595933832
The proceedings contain 140 papers. The topics discussed include: Bayesian multi-population haplotype inference via a hierarchical Dirichlet process mixture; discriminative unsupervised learning of structured predictors; active learning via orthogonal linear discriminant analysis; active learning via transductive experimental design; collaborative ordinal regression; efficient lazy elimination for averaged one-dependence estimators; Bayesian learning of measurement and structural models; deterministic annealing for semi-supervised kernel machines; feature subset selection bias for classification learning; Bayesian pattern ranking for move prediction in the game of Go; experience-efficient learning in associative bandit problems; learning low-rank kernel matrices; simpler knowledge-based support vector machines; a probabilistic model for text kernels; nonstationary kernel combination; and region-based value iteration for partially observable Markov decision processes.
ISBN (print): 1595933832
The proceedings contain 140 papers. The topics discussed include: using inaccurate models in reinforcement learning; algorithms for portfolio management based on the Newton method; higher order learning with graphs; ranking on graph data; robust probabilistic projections; a DC-programming algorithm for kernel selection; relational temporal difference learning; a new approach to data driven clustering; agnostic active learning; on a theory of learning with similarity functions; on Bayesian bounds; convex optimization techniques for fitting sparse Gaussian graphical models; cover trees for nearest neighbor; graph model selection using maximum likelihood; dynamic topic models; predictive search distributions; learning predictive state representations using non-blind policies; efficient co-regularised least squares regression; semi-supervised learning for structured output variables; and fast nonparametric clustering with Gaussian blurring mean-shift.
ISBN (print): 1595933832
We present the first large-scale empirical application of reinforcement learning to the important problem of optimized trade execution in modern financial markets. Our experiments are based on 1.5 years of millisecond time-scale limit order data from NASDAQ, and demonstrate the promise of reinforcement learning methods for market microstructure problems. Our learning algorithm introduces and exploits a natural "low-impact" factorization of the state space.
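As a rough illustration of how reinforcement learning is typically set up for trade execution, the sketch below runs tabular Q-learning over a state built from time remaining and inventory remaining, with actions given as limit-price offsets. The state variables, action set, and hyperparameters are assumptions made for illustration, not the factorization or learner used in the paper.

import random
from collections import defaultdict

# Illustrative only: tabular Q-learning for trade execution. The state is
# assumed to be a (time_remaining, inventory_remaining) tuple, possibly with
# extra market features; actions are limit-order price offsets in ticks.
ACTIONS = [-2, -1, 0, 1, 2]          # hypothetical offsets from the best quote
ALPHA, GAMMA, EPSILON = 0.1, 1.0, 0.1

Q = defaultdict(float)               # Q[(state, action)] -> estimated value

def choose_action(state):
    """Epsilon-greedy selection over the discrete offset set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, done):
    """One Q-learning backup toward reward + discounted best next-state value."""
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

In a setup like this the reward would typically be the negative execution cost incurred over the step; the "low-impact" factorization the abstract mentions is a property of the paper's formulation and is not captured in this sketch.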
ISBN (print): 1595933832
Convex learning algorithms, such as Support Vector Machines (SVMs), are often seen as highly desirable because they offer strong practical properties and are amenable to theoretical analysis. However, in this work we show how non-convexity can provide scalability advantages over convexity. We show how concave-convex programming can be applied to produce (i) faster SVMs where training errors are no longer support vectors, and (ii) much faster Transductive SVMs.
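For context, one standard way to obtain a non-convex SVM loss and optimize it with concave-convex programming (CCCP) is the ramp loss, written as a difference of two hinge losses; the exact objectives used in the paper may differ in detail. A sketch of the decomposition and the CCCP iteration:

\[
R_s(z) = H_1(z) - H_s(z), \qquad H_a(z) = \max(0,\, a - z),
\]
\[
\theta^{t+1} = \arg\min_{\theta}\ \Big[\, J_{\mathrm{vex}}(\theta) + J'_{\mathrm{cav}}(\theta^{t})\,\theta \,\Big].
\]

Because a ramp-style loss is flat for examples whose margin falls below the cutoff s, badly misclassified training points stop contributing support vectors, which is the kind of effect behind the speedup the abstract describes.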
ISBN (print): 1595933832
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exploit later? We formulate this problem as a Markov Decision Process by explicitly modeling the internal state of the agent and propose a principled heuristic for its solution. We present experimental results in a number of domains, also exploring the algorithm's use for learning a policy for a skill given its reward function, an important but neglected component of skill discovery.
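Purely as an illustration of what "explicitly modeling the internal state of the agent" can mean, the snippet below folds a summary of the agent's experience (here, simple visit counts) into the state on which a policy or planner operates. This is a generic, assumed construction for illustration, not the paper's formulation or its heuristic.

from collections import Counter

def augmented_state(env_state, transitions):
    """Pair the environment state with a summary of experience so far,
    so that a policy can condition on what the agent has already learned.
    transitions: iterable of (state, action, next_state) triples."""
    counts = Counter((s, a) for (s, a, s_next) in transitions)
    return (env_state, tuple(sorted(counts.items())))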
ISBN (print): 1595933832
I consider the setting of transductive learning of vertex labels in graphs, in which a graph with n vertices is sampled according to some unknown distribution; there is a true labeling of the vertices such that each vertex is assigned to exactly one of k classes, but the labels of only some (random) subset of the vertices are revealed to the learner. The task is then to find a labeling of the remaining (unlabeled) vertices that agrees as much as possible with the true labeling. Several existing algorithms are based on the assumption that adjacent vertices are usually labeled the same. In order to better understand algorithms based on this assumption, I derive data-dependent bounds on the fraction of mislabeled vertices, based on the number (or total weight) of edges between vertices differing in predicted label (i.e., the size of the cut).
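The quantity the bounds are stated in terms of is simply the (weighted) cut induced by the predicted labeling. A minimal sketch of computing it, with illustrative edge and label containers:

def labeling_cut(edges, labels):
    """Total weight of edges whose endpoints receive different predicted labels.
    edges: iterable of (u, v, weight); labels: dict mapping vertex -> class."""
    return sum(w for u, v, w in edges if labels[u] != labels[v])

# Example: a 4-cycle with unit weights and a two-class labeling cuts two edges.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 1.0), ("d", "a", 1.0)]
labels = {"a": 0, "b": 0, "c": 1, "d": 1}
print(labeling_cut(edges, labels))  # 2.0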
ISBN (print): 1595933832
We show that several important Bayesian bounds studied in machine learning, both in the batch as well as the online setting, arise by an application of a simple compression lemma. In particular, we derive (i) PAC-Bayesian bounds in the batch setting, (ii) Bayesian log-loss bounds and (iii) Bayesian bounded-loss bounds in the online setting using the compression lemma. Although every setting has different semantics for prior, posterior and loss, we show that the core bound argument is the same. The paper simplifies our understanding of several important and apparently disparate results, as well as brings to light a powerful tool for developing similar arguments for other methods.
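One standard statement of the compression lemma (in its Donsker-Varadhan form) that yields such bounds: for any measurable function \(\phi\) and any distributions P (prior) and Q (posterior) over the same space,

\[
\mathbb{E}_{w \sim Q}\left[\phi(w)\right] \;\le\; \mathrm{KL}(Q \,\|\, P) \;+\; \ln \mathbb{E}_{w \sim P}\left[e^{\phi(w)}\right].
\]

Choosing \(\phi\) to measure a scaled loss or a log-likelihood ratio and then bounding the exponential moment on the right recovers PAC-Bayesian and Bayesian online bounds; the specific choices are the subject of the paper.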
ISBN (print): 1595933832
We introduce the use of learned shaping rewards in reinforcement learning tasks, where an agent uses prior experience on a sequence of tasks to learn a portable predictor that estimates intermediate rewards, resulting in accelerated learning in later tasks that are related but distinct. Such agents can be trained on a sequence of relatively easy tasks in order to develop a more informative measure of reward that can be transferred to improve performance on more difficult tasks without requiring a hand-coded shaping function. We use a rod positioning task to show that this significantly improves performance even after a very brief training period.
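As a rough sketch of how a learned shaping signal plugs into a standard learner, the update below adds the output of a regressor trained on earlier tasks (the hypothetical shaping_model, applied to portable features phi) to the environment reward before a Q-learning backup. All names and hyperparameters are illustrative, not the paper's agent.

def shaped_q_update(Q, s, a, r, s_next, phi, shaping_model, actions,
                    alpha=0.1, gamma=0.99):
    """Q-learning backup with the reward augmented by a learned shaping estimate.
    Q: dict mapping (state, action) -> value; shaping_model: callable on features."""
    r_shaped = r + shaping_model(phi)   # learned estimate of intermediate reward
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r_shaped + gamma * best_next - Q.get((s, a), 0.0))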
ISBN (print): 1595933832
Kernel learning plays an important role in many machine learning tasks. However, algorithms for learning a kernel matrix often scale poorly, with running times that are cubic in the number of data points. In this paper, we propose efficient algorithms for learning low-rank kernel matrices; our algorithms scale linearly in the number of data points and quadratically in the rank of the kernel. We introduce and employ Bregman matrix divergences for rank-deficient matrices; these divergences are natural for our problem since they preserve the rank as well as the positive semi-definiteness of the kernel matrix. Special cases of our framework yield faster algorithms for various existing kernel learning problems. Experimental results demonstrate the effectiveness of our algorithms in learning both low-rank and full-rank kernels.
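For reference, two standard Bregman matrix divergences of the kind the abstract refers to, written here for full-rank positive definite n x n matrices X and Y (extending such divergences to rank-deficient matrices is part of the paper's contribution): the von Neumann divergence and the LogDet (Burg) divergence,

\[
D_{\mathrm{vN}}(X, Y) = \mathrm{tr}\!\left(X \log X - X \log Y - X + Y\right), \qquad
D_{\ell d}(X, Y) = \mathrm{tr}\!\left(X Y^{-1}\right) - \log\det\!\left(X Y^{-1}\right) - n.
\]

Both arise as Bregman divergences of spectral functions (the negative von Neumann entropy and the Burg entropy, respectively), properties the abstract ties to keeping updates rank- and positive-semidefiniteness-preserving.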