Average reward adjusted discounted reinforcement learning

Authors: Schneckenreither, Manuel; Moser, Georg

Affiliations: Department of Information Systems, Production and Logistics Management, University of Innsbruck, Universitaetsstrasse 15, 6020 Innsbruck, Austria; Computer Science, University of Innsbruck, Technikerstrasse 21a, 6020 Innsbruck, Austria

Publication: Neural Computing and Applications (Neural Comput. Appl.)

Year/Volume/Issue: 2025

Pages: 1-32

Subject classification: 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (eligible for engineering or science degrees)]

Keywords: Adversarial machine learning

Abstract: In this paper, we provide a novel view of reinforcement learning (RL for short). In particular, we are interested in applications of RL in use cases where average rewards may be nonzero. While RL methodologies have been extensively researched, this particular application area has received only scarce attention in the literature. In part, our motivation stems from applications in Operations Research (OR for short), where rewards are typically derived from profit. Similar use cases can be found in more general applications in economics. Based on a principled study of the mathematical background of discounted reinforcement learning, we establish a novel adaptation of standard RL, dubbed Average Reward Adjusted Discounted Reinforcement Learning (ARAL for short). Our approach stems from revisiting the Laurent series expansion of the discounted state value and a subsequent reformulation of the target function guiding the learning process. While the theoretical advance is arguably incremental, we provide ample experimental evidence that the resulting RL methodology compares favorably to well-established techniques like Q-learning or R-learning. © The Author(s) 2025.
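For context, a minimal sketch of the standard Laurent series expansion the abstract refers to, stated for a unichain MDP with gain (average reward) \rho and bias h; this notation is assumed here for illustration and is not taken from the paper itself:

    V_\gamma(s) = \frac{\rho}{1-\gamma} + h(s) + e_\gamma(s), \qquad \text{where } e_\gamma(s) \to 0 \text{ as } \gamma \to 1.

Separating the discounted value into the average-reward term \rho/(1-\gamma) and the bias h(s) is what makes it natural to adjust the discounted learning target by the average reward, which is the direction the abstract describes; the exact reformulated target used by ARAL is not given in this record.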
