Author affiliations: Department of Information Systems, Production and Logistics Management, University of Innsbruck, Universitaetsstrasse 15, Innsbruck 6020, Austria; Computer Science, University of Innsbruck, Technikerstrasse 21a, Innsbruck 6020, Austria
Publication: Neural Computing and Applications (Neural Comput. Appl.)
Year, volume, issue: 2025
Pages: 1-32
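Subject classification: 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]
Topic: Adversarial machine learning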
Abstract: In this paper, we provide a novel view on reinforcement learning (RL for short). In particular, we are interested in applications of RL to use cases where the average reward may be nonzero. While RL methodologies have been researched extensively, this particular application area has received scant attention in the literature. In part, our motivation stems from applications in Operations Research (OR for short), where rewards are typically derived from profit. Similar use cases can be found in more general applications in economics. Based on a principled study of the mathematical background of discounted reinforcement learning, we establish a novel adaptation of standard RL, dubbed Average Reward Adjusted Discounted Reinforcement Learning (ARAL for short). Our approach stems from revisiting the Laurent series expansion of the discounted state value and a subsequent reformulation of the target function guiding the learning process. While the theoretical advance is arguably incremental, we provide ample experimental evidence that the resulting RL methodology compares favorably with well-established techniques such as Q-learning and R-learning. © The Author(s) 2025.
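The Laurent series expansion mentioned in the abstract can be illustrated numerically. The sketch below is not the paper's ARAL algorithm, which this record does not describe. It only checks the standard textbook decomposition of the discounted state value, V_gamma(s) = rho / (1 - gamma) + h(s) + e_gamma(s), where rho is the average reward, h(s) is the bias value, and e_gamma(s) vanishes as gamma approaches 1. The two-state Markov reward process, its numbers, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical two-state Markov reward process (not taken from the paper).
P = [[0.9, 0.1],
     [0.2, 0.8]]   # row-stochastic transition matrix
r = [1.0, 3.0]     # reward collected in each state


def average_reward(P, r):
    """Average reward rho = pi . r, with pi the stationary distribution."""
    pi0 = P[1][0] / (P[0][1] + P[1][0])  # closed form valid for two states
    return pi0 * r[0] + (1.0 - pi0) * r[1]


def discounted_value(P, r, gamma):
    """Solve (I - gamma * P) V = r for two states by Cramer's rule."""
    a, b = 1.0 - gamma * P[0][0], -gamma * P[0][1]
    c, d = -gamma * P[1][0], 1.0 - gamma * P[1][1]
    det = a * d - b * c
    return [(r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


def bias_value(P, r, rho, steps=500):
    """Bias h = sum over t of P^t (r - rho), truncated after `steps` terms."""
    x = [r[0] - rho, r[1] - rho]
    h = [0.0, 0.0]
    for _ in range(steps):
        h = [h[0] + x[0], h[1] + x[1]]
        x = [P[0][0] * x[0] + P[0][1] * x[1],
             P[1][0] * x[0] + P[1][1] * x[1]]
    return h


rho = average_reward(P, r)
h = bias_value(P, r, rho)
for gamma in (0.9, 0.99, 0.999):
    v = discounted_value(P, r, gamma)
    # Removing the rho / (1 - gamma) term leaves h(s) plus a small error.
    adjusted = [v_s - rho / (1.0 - gamma) for v_s in v]
    print(gamma, v, adjusted, h)
```

With a nonzero average reward, the raw discounted values grow without bound as gamma approaches 1, while the adjusted values settle near the bias values. This is one way to read the abstract's motivation for adjusting the learning target by the average reward.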