Author affiliations: Maastricht University, Department of Quantitative Economics, P.O. Box 616, NL-6200 MD Maastricht, Netherlands; Univ. Grenoble Alpes, CNRS, Inria, LIG, 38000 Grenoble, France; Criteo AI Lab; Maastricht University, Department of Data Science and Knowledge Engineering, P.O. Box 616, NL-6200 MD Maastricht, Netherlands
Publication: arXiv
Year: 2018
Abstract: We examine the long-run behavior of multi-agent online learning in games that evolve over time. Specifically, we focus on a wide class of policies based on mirror descent, and we show that the induced sequence of play (a) converges to Nash equilibrium in time-varying games that stabilize in the long run to a strictly monotone limit; and (b) stays asymptotically close to the evolving equilibrium of the sequence of stage games (assuming they are strongly monotone). Our results apply to both gradient-based and payoff-based feedback, i.e., when players only get to observe the payoffs of their chosen actions.
MSC Codes: Primary 91A10, 91A26; Secondary 68Q32, 68T02
Copyright © 2018, The Authors. All rights reserved.
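As a rough illustration of the kind of dynamics the abstract describes (not the paper's exact algorithm or setting), the sketch below runs entropic mirror descent on the probability simplex against a time-varying strongly convex stage loss whose minimizers drift toward a fixed limit point. All names (`md_step`, `target`, the 1/t drift, the step size) are assumptions chosen for the example; the paper's results concern strictly/strongly monotone games with gradient- or payoff-based feedback.

```python
import numpy as np

def md_step(x, grad, eta):
    """One entropic (multiplicative-weights) mirror descent step on the simplex:
    x_{t+1} is proportional to x_t * exp(-eta * grad_t)."""
    y = x * np.exp(-eta * grad)
    return y / y.sum()

d = 3
target = np.array([0.6, 0.3, 0.1])  # limit point of the drifting minimizers
x = np.full(d, 1.0 / d)             # uniform initialization on the simplex

for t in range(1, 5001):
    # Time-varying strongly convex stage loss f_t(x) = ||x - p_t||^2,
    # whose minimizers p_t converge to `target` at rate O(1/t).
    p_t = target + np.array([1.0, -1.0, 0.0]) / t
    grad = 2 * (x - p_t)
    x = md_step(x, grad, eta=0.1)

print(np.round(x, 3))  # x tracks the drifting minimizer and ends near `target`
```

Because the stage losses are strongly convex and their minimizers stabilize, the iterate tracks the moving optimum and converges to the limit point, mirroring (in a single-agent toy) the tracking behavior the abstract states for strongly monotone stage games.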