Search results - Inner Mongolia University Library

Learning Expected Emphatic Traces for Deep RL

arXiv, 2021

Authors: Jiang, Ray; Zhang, Shangtong; Chelu, Veronica; White, Adam; van Hasselt, Hado

Affiliations: DeepMind, London, United Kingdom; University of Oxford, Oxford, United Kingdom; McGill University, Montreal, QC, Canada; DeepMind, Edmonton, Canada

Off-policy sampling and experience replay are key for improving sample efficiency and scaling model-free temporal difference learning methods. When combined with function approximation, such as neural networks, this combination is known as the deadly triad and is potentially unstable. Recently, it has been shown that stability and good performance at scale can be achieved by combining emphatic weightings and multi-step updates. This approach, however, is generally limited to sampling complete trajectories in order, to compute the required emphatic weighting. In this paper we investigate how to combine emphatic weightings with non-sequential, off-line data sampled from a replay buffer. We develop a multi-step emphatic weighting that can be combined with replay, and a time-reversed n-step TD learning algorithm to learn the required emphatic weighting. We show that these state weightings reduce variance compared with prior approaches, while providing convergence guarantees. We tested the approach at scale on Atari 2600 video games, and observed that the new X-ETD(n) agent improved over baseline agents, highlighting both the scalability and broad applicability of our approach. Copyright © 2021, The Authors. All rights reserved.
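The ordering constraint mentioned in this abstract can be illustrated with the classic followon trace of emphatic TD, F_t = gamma * rho_{t-1} * F_{t-1} + i, which depends on every earlier step of the trajectory. The sketch below is background only and is not the X-ETD(n) algorithm from the paper; the function name, the constant interest value, and the sample importance ratios are assumptions made for illustration.

```python
def followon_traces(rhos, gamma, interest=1.0):
    """Classic emphatic-TD followon trace along one ordered trajectory.

    rhos[t] is the importance-sampling ratio at step t (hypothetical inputs).
    F_0 = interest and F_t = gamma * rhos[t-1] * F_{t-1} + interest.
    """
    if not rhos:
        return []
    traces = [interest]
    # Each trace needs the previous one, so steps must be visited in order.
    for rho_prev in rhos[:-1]:
        traces.append(gamma * rho_prev * traces[-1] + interest)
    return traces


# Illustrative values only.
traces = followon_traces([1.0, 0.5, 2.0], gamma=0.9)
```

Because each value is built from its predecessor, a transition drawn at random from a replay buffer has no trace available, which is the gap the abstract says a learned, expected weighting is meant to fill.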

Keywords: learning algorithms

Federated reconnaissance: Efficient, distributed, class-incremental learning

arXiv, 2021

Authors: Hendryx, Sean M.; Dharma Raj, K.C.; Walls, Bradley; Morrison, Clayton T.

Affiliations: School of Information, University of Arizona; Department of Computer Science, University of Arizona; Areté Associates

We describe federated reconnaissance, a class of learning problems in which distributed clients learn new concepts independently and communicate that knowledge efficiently. In particular, we propose an evaluation framework and methodological baseline for a system in which each client is expected to learn a growing set of classes and communicate knowledge of those classes efficiently with other clients, such that, after knowledge merging, the clients should be able to accurately discriminate between classes in the superset of classes observed by the set of clients. We compare a range of learning algorithms for this problem and find that prototypical networks are a strong approach in that they are robust to catastrophic forgetting while incorporating new information efficiently. Furthermore, we show that the online averaging of prototype vectors is effective for client model merging and requires only a small amount of communication overhead, memory, and update time per class with no gradient-based learning or hyperparameter tuning. Additionally, to put our results in context, we find that a simple, prototypical network with four convolutional layers significantly outperforms complex, state of the art continual learning algorithms, increasing the accuracy by over 22% after learning 600 Omniglot classes and over 33% after learning 20 mini-ImageNet classes incrementally. These results have important implications for federated reconnaissance and continual learning more generally by demonstrating that communicating feature vectors is an efficient, robust, and effective means for distributed, continual learning. © 2021, CC BY.
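The prototype averaging described in this abstract can be sketched as a count-weighted mean, which needs no gradient steps and only one vector plus one counter per class. This is a minimal sketch under assumptions: the function names, the use of per-class example counts as weights, and the toy embeddings are illustrative and are not taken from the paper.

```python
import numpy as np


def update_prototype(proto, count, embedding):
    """Fold one new embedding into a running class mean (online average)."""
    new_count = count + 1
    return proto + (embedding - proto) / new_count, new_count


def merge_prototypes(proto_a, count_a, proto_b, count_b):
    """Merge two clients' prototypes for the same class.

    Weighting by example counts makes the result equal to the mean
    over all embeddings seen by either client.
    """
    total = count_a + count_b
    return (count_a * proto_a + count_b * proto_b) / total, total


# Toy embeddings for one class, observed by two hypothetical clients.
emb_a = np.array([[1.0, 0.0], [3.0, 0.0]])
emb_b = np.array([[0.0, 6.0]])
proto_a, n_a = emb_a.mean(axis=0), len(emb_a)
proto_b, n_b = emb_b.mean(axis=0), len(emb_b)

merged, n = merge_prototypes(proto_a, n_a, proto_b, n_b)
online, k = update_prototype(proto_a, n_a, emb_b[0])
```

In this toy case both routes give the same vector as averaging all three embeddings directly, so clients only ever need to exchange the prototype and its count.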

Keywords: learning algorithms

Bayesian Active Meta-learning for Black-Box Optimization

arXiv, 2021

Authors: Nikoloska, Ivana; Simeone, Osvaldo

Affiliations: KCLIP, CTR, Dept. of Engineering, King’s College London, United Kingdom

Data-efficient learning algorithms are essential in many practical applications for which data collection is expensive, e.g., for the optimal deployment of wireless systems in unknown propagation scenarios. Meta-learning can address this problem by leveraging data from a set of related learning tasks, e.g., from similar deployment settings. In practice, one may have available only unlabeled data sets from the related tasks, requiring a costly labeling procedure to be carried out before use in meta-learning. For instance, one may know the possible positions of base stations in a given area, but not the performance indicators achievable with each deployment. To decrease the number of labeling steps required for meta-learning, this paper introduces an information-theoretic active task selection mechanism, and evaluates an instantiation of the approach for Bayesian optimization of black-box models. Copyright © 2021, The Authors. All rights reserved.

Keywords: learning algorithms

Convergence and accuracy trade-offs in federated learning and meta-learning

arXiv, 2021

Authors: Charles, Zachary; Konečný, Jakub

Affiliations: Google Research, United States

We study a family of algorithms, which we refer to as local update methods, generalizing many federated and meta-learning algorithms. We prove that for quadratic models, local update methods are equivalent to first-order optimization on a surrogate loss we exactly characterize. Moreover, fundamental algorithmic choices (such as learning rates) explicitly govern a trade-off between the condition number of the surrogate loss and its alignment with the true loss. We derive novel convergence rates showcasing these trade-offs and highlight their importance in communication-limited settings. Using these insights, we are able to compare local update methods based on their convergence/accuracy trade-off, not just their convergence to critical points of the empirical loss. Our results shed new light on a broad range of phenomena, including the efficacy of server momentum in federated learning and the impact of proximal client updates. Copyright © 2021, The Authors. All rights reserved.
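The equivalence claimed in this abstract for quadratic models can be checked numerically in a simple single-client case. For f(x) = 0.5 (x - c)^T A (x - c), K local gradient steps with step size lr move the iterate by exactly Q (x0 - c), where Q = I - (I - lr A)^K, and that is the gradient of a surrogate quadratic with matrix Q. This sketch is not the paper's exact characterization; the names and the matrices A, c, lr, and K are assumptions chosen for illustration.

```python
import numpy as np


def local_steps(x0, A, c, lr, K):
    """Run K plain gradient steps on f(x) = 0.5 (x - c)^T A (x - c)."""
    x = x0.copy()
    for _ in range(K):
        x = x - lr * (A @ (x - c))
    return x


def surrogate_matrix(A, lr, K):
    """Q = I - (I - lr*A)^K, the curvature of the implied surrogate loss."""
    identity = np.eye(A.shape[0])
    return identity - np.linalg.matrix_power(identity - lr * A, K)


# Hypothetical client objective and hyperparameters.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
c = np.array([1.0, -1.0])
x0 = np.zeros(2)
lr, K = 0.1, 5

delta = x0 - local_steps(x0, A, c, lr, K)               # client update
surrogate_grad = surrogate_matrix(A, lr, K) @ (x0 - c)  # one surrogate gradient
```

The two vectors agree up to floating-point error, so a server that applies the client update is in effect taking a first-order step on the surrogate loss, whose conditioning changes with lr and K.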

Keywords: learning algorithms

On sensitivity of meta-learning to support data

arXiv, 2021

Authors: Agarwal, Mayank; Yurochkin, Mikhail; Sun, Yuekai

Affiliations: IBM Research; MIT-IBM Watson AI Lab; University of Michigan

Meta-learning algorithms are widely used for few-shot learning, for example in image recognition systems that readily adapt to unseen classes after seeing only a few labeled examples. Despite their success, we show that modern meta-learning algorithms are extremely sensitive to the data used for adaptation, i.e., the support data. In particular, we demonstrate the existence of (unaltered, in-distribution, natural) images that, when used for adaptation, yield accuracy as low as 4% or as high as 95% on standard few-shot image classification benchmarks. We explain our empirical findings in terms of class margins, which in turn suggests that robust and safe meta-learning requires larger margins than supervised learning. Copyright © 2021, The Authors. All rights reserved.

Keywords: learning algorithms

Reward function shape exploration in adversarial imitation learning: An empirical study

arXiv, 2021

Authors: Wang, Yawei; Li, Xiu

Affiliations: Shenzhen International Graduate School, Tsinghua University, Shenzhen, China

For adversarial imitation learning algorithms (AILs), no true rewards are obtained from the environment for learning the strategy. However, pseudo rewards based on the output of the discriminator are still required. Given the implicit reward bias problem in AILs, we design several representative reward function shapes and compare their performance in large-scale experiments. To ensure the reliability of our results, we conduct the experiments on a series of MuJoCo and Box2D continuous control tasks based on four different AILs. We also compare the performance of various reward function shapes using varying numbers of expert trajectories. The empirical results reveal that the positive logarithmic reward function works well in typical continuous control tasks. In contrast, the so-called unbiased reward function is limited to specific kinds of tasks. Furthermore, several designed reward functions perform excellently in these environments as well. Copyright © 2021, The Authors. All rights reserved.

Keywords: learning algorithms

A mathematical foundation for robust machine learning based on bias-variance trade-off

arXiv, 2021

Authors: Wu, Ou; Zhu, Weiyao; Deng, Yingjun; Zhang, Haixiang; Hou, Qinghu

Affiliations: Center for Applied Mathematics, Tianjin University; Department of Applied Mathematics, INSA Rouen, France

A common assumption in machine learning is that samples are independently and identically distributed (i.i.d.). However, the contributions of different samples to training are not identical: some samples are difficult to learn and some are noisy. These unequal contributions have a considerable effect on training performance. Studies focusing on unequal sample contributions (e.g., easy, hard, noisy) usually refer to this line of work as robust machine learning (RML). Weighting and regularization are two common techniques in RML. Numerous learning algorithms have been proposed, but their strategies for dealing with easy, hard, and noisy samples differ or even contradict one another. For example, some strategies take hard samples first, whereas others take easy samples first. A clear comparison of how existing RML algorithms deal with different samples is difficult because RML lacks a unified theoretical framework. This study attempts to construct a mathematical foundation for RML based on the bias-variance trade-off theory. A series of definitions and properties are presented and proved. Several classical learning algorithms are also explained and compared. Improvements of existing methods are obtained based on the comparison. A unified method that combines two classical learning strategies is proposed. Copyright © 2021, The Authors. All rights reserved.

Keywords: learning algorithms

Rethinking Image-Scaling Attacks: The Interplay Between Vulnerabilities in Machine Learning Systems

arXiv, 2021

Authors: Gao, Yue; Shumailov, Ilia; Fawaz, Kassem

Affiliations: University of Wisconsin-Madison, Madison, WI, United States; Vector Institute, Toronto, ON, Canada

As real-world images come in varying sizes, the machine learning model is part of a larger system that includes an upstream image scaling algorithm. In this paper, we investigate the interplay between vulnerabilities of the image scaling procedure and machine learning models in the decision-based black-box setting. We propose a novel sampling strategy to make a black-box attack exploit vulnerabilities in scaling algorithms, scaling defenses, and the final machine learning model in an end-to-end manner. Based on this scaling-aware attack, we reveal that most existing scaling defenses are ineffective under threat from downstream models. Moreover, we empirically observe that standard black-box attacks can significantly improve their performance by exploiting the vulnerable scaling procedure. We further demonstrate this problem on a commercial Image Analysis API with decision-based black-box attacks. © 2021, CC BY-NC-ND.

Keywords: learning algorithms

A New Approach for Active Automata Learning Based on Apartness

arXiv, 2021

Authors: Vaandrager, Frits; Garhewal, Bharat; Rot, Jurriaan; Wißmann, Thorsten

Affiliations: Institute for Computing and Information Sciences, Radboud University, Nijmegen, Netherlands

We present L#, a new and simple approach to active automata learning. Instead of focusing on equivalence of observations, like the L* algorithm and its descendants, L# takes a different perspective: it tries to establish apartness, a constructive form of inequality. L# does not require auxiliary notions such as observation tables or discrimination trees, but operates directly on tree-shaped automata. L# has the same asymptotic query and symbol complexities as the best existing learning algorithms, but we show that adaptive distinguishing sequences can be naturally integrated to boost the performance of L# in practice. Experiments with a prototype implementation, written in Rust, suggest that L# is competitive with existing algorithms. © 2021, CC BY.
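The idea of apartness on a tree-shaped automaton can be sketched as a search for a witness: an input word that is defined from both states and produces different outputs. The code below is an illustrative sketch assuming a Mealy-style observation tree; the `Node` class, the `apartness_witness` function, and the toy observations are hypothetical and are not taken from the paper's Rust prototype.

```python
from typing import List, Optional


class Node:
    """A state of a tree-shaped automaton built from observed input/output words."""

    def __init__(self) -> None:
        self.edges = {}  # input symbol -> (output symbol, child Node)

    def add(self, inputs: str, outputs: str) -> "Node":
        """Record one observation and return the state reached at its end."""
        node = self
        for sym, out in zip(inputs, outputs):
            if sym not in node.edges:
                node.edges[sym] = (out, Node())
            node = node.edges[sym][1]
        return node


def apartness_witness(q: Node, r: Node) -> Optional[List[str]]:
    """Return an input word showing q and r are apart, or None if none is known."""
    for sym, (out_q, child_q) in q.edges.items():
        if sym not in r.edges:
            continue  # nothing observed from r on this input, so no evidence
        out_r, child_r = r.edges[sym]
        if out_q != out_r:
            return [sym]
        rest = apartness_witness(child_q, child_r)
        if rest is not None:
            return [sym] + rest
    return None


# Toy observations: input "aa" yields outputs "01", input "ba" yields "00".
root = Node()
leaf_aa = root.add("aa", "01")
leaf_ba = root.add("ba", "00")
state_a = root.edges["a"][1]
state_b = root.edges["b"][1]
```

Here `state_a` and `state_b` are apart with witness `["a"]`, while the two leaves have no shared observations, so nothing can yet be concluded about them. A `None` result means "not known to be apart", which is weaker than equivalence.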

Keywords: learning algorithms