This paper investigates reinforcement learning algorithms for discrete-time stochastic multi-agent graphical games with multiplicative noise. The Bellman optimality equation for stochastic multi-agent graphical games is obtained by using the optimality principle. A Nash equilibrium is reached when each agent executes the strategy derived from its Bellman optimality equation. To circumvent the difficulty of solving the coupled Bellman equations, a value iteration heuristic dynamic programming (HDP) algorithm is designed and its convergence is proved. To solve multi-agent graphical games online, an HDP algorithm based on the actor-critic framework is designed to approximate the Nash equilibrium solutions. The effectiveness of the algorithm is verified by two numerical simulation examples.
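The value iteration HDP scheme described above can be illustrated on a much simpler case than the paper's multi-agent setting. The sketch below is a single-agent, scalar stochastic LQR with multiplicative noise, x_{k+1} = (a + c*w_k)x_k + b*u_k with w_k ~ N(0,1) and stage cost q*x^2 + r*u^2; all parameter values (a, b, c, q, r) and the function name are illustrative assumptions, not taken from the paper.

```python
# Value-iteration HDP sketch for a scalar stochastic LQR with
# multiplicative noise: x_{k+1} = (a + c*w_k)*x_k + b*u_k, w_k ~ N(0,1).
# All system and cost parameters below are illustrative assumptions.

def hdp_value_iteration(a, b, c, q, r, tol=1e-12, max_iter=10_000):
    """Iterate V_i(x) = p_i * x^2.  Each sweep performs the Bellman backup
        p_{i+1} = q + p_i*(a^2 + c^2) - (p_i*a*b)^2 / (r + p_i*b^2),
    the closed form of min_u E[q*x^2 + r*u^2 + p_i*x'^2]; the c^2 term
    is the contribution of the multiplicative noise to E[x'^2]."""
    p = 0.0  # value iteration starts from V_0 = 0
    for _ in range(max_iter):
        p_next = q + p * (a**2 + c**2) - (p * a * b) ** 2 / (r + p * b**2)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("value iteration did not converge")

p_star = hdp_value_iteration(a=0.9, b=1.0, c=0.3, q=1.0, r=1.0)
# Optimal feedback for this scalar model: u = -p*a*b/(r + p*b^2) * x
gain = p_star * 0.9 / (1.0 + p_star)
```

In the paper's setting each agent's backup is additionally coupled to its graph neighbors' strategies, which is exactly why the coupled equations are iterated rather than solved in closed form.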
ISBN (digital): 9789887581581
ISBN (print): 9798350366907
This paper considers a discrete-time stochastic multi-agent graphical games problem with multiplicative noise based on reinforcement learning. The Bellman optimality equations for multi-agent graphical games are obtained by using the optimality principle. Through stability analysis, it is proved that the solutions of the equations converge to a Nash equilibrium. Since the coupled Bellman equations are difficult to solve, a value iteration heuristic dynamic programming (HDP) algorithm is designed. To solve multi-agent graphical games online, an HDP algorithm based on the actor-critic framework is designed to find the approximate solutions, and the effectiveness of the algorithm is verified by a simulation example.
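The actor-critic structure mentioned above alternates a critic update (policy evaluation) with an actor update (policy improvement). The following sketch shows that split on the same illustrative scalar model with multiplicative noise (x_{k+1} = (a + c*w_k)x_k + b*u_k); the quadratic critic wc*x^2, the linear actor wa*x, and all parameter values are assumptions for this toy single-agent case, not the paper's multi-agent graph-coupled method.

```python
# Actor-critic sketch on an illustrative scalar model with multiplicative
# noise.  Critic: V(x) ~ wc*x^2.  Actor: u = wa*x.  The paper's setting is
# multi-agent and graph-coupled; this single-agent toy only shows the
# alternation of critic and actor updates.

def actor_critic(a, b, c, q, r, sweeps=200):
    wc, wa = 0.0, 0.0  # critic and actor weights, initialized at zero
    for _ in range(sweeps):
        # Critic step: one Bellman backup under the current actor, using
        # E[x'^2] = ((a + b*wa)^2 + c^2) * x^2 for the noise model.
        wc = q + r * wa**2 + wc * ((a + b * wa) ** 2 + c**2)
        # Actor step: greedy improvement with respect to the current critic.
        wa = -wc * a * b / (r + wc * b**2)
    return wc, wa

wc, wa = actor_critic(a=0.9, b=1.0, c=0.3, q=1.0, r=1.0)
```

In an online implementation the critic step would instead be driven by sampled temporal-difference errors along trajectories; the deterministic backup above uses the known noise statistics only to keep the sketch reproducible.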