
Refine Results

Document Type

  • 18 journal articles
  • 8 conference papers
  • 1 thesis

Holdings

  • 27 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 24 Engineering
    • 17 Computer Science and Technology...
    • 7 Information and Communication Engineering
    • 6 Electrical Engineering
    • 6 Control Science and Engineering
    • 1 Transportation Engineering
    • 1 Software Engineering
  • 6 Science
    • 6 Mathematics
    • 1 Systems Science
  • 3 Management
    • 3 Management Science and Engineering (...

Topics

  • 27 linear function ...
  • 15 reinforcement le...
  • 5 q-learning
  • 4 energy harvestin...
  • 3 planning
  • 3 markov decision ...
  • 3 decode and forwa...
  • 2 ode method
  • 2 finite-sample an...
  • 2 wireless edge ca...
  • 2 massive mimo
  • 2 policy iteration
  • 2 two-hop communic...
  • 2 cross entropy me...
  • 1 td learning fami...
  • 1 vehicle dynamics
  • 1 control problem
  • 1 large file libra...
  • 1 tensor completio...
  • 1 off-policy predi...

Institutions

  • 2 univ alberta edm...
  • 2 univ rostock ins...
  • 2 deepmind england
  • 1 guangxi univ tec...
  • 1 caltech comp mat...
  • 1 soochow univ sch...
  • 1 univ pompeu fabr...
  • 1 heriot watt univ...
  • 1 georgia inst tec...
  • 1 univ texas austi...
  • 1 univ edinburgh i...
  • 1 univ edinburgh s...
  • 1 georgia inst tec...
  • 1 indian inst sci ...
  • 1 princeton univ p...
  • 1 suzhou univ sci ...
  • 1 duke univ dept e...
  • 1 tech univ darmst...
  • 1 univ edinburgh e...
  • 1 lnm inst informa...

Authors

  • 3 garg navneet
  • 3 ratnarajah tharm...
  • 3 maguluri siva th...
  • 2 joseph ajin geor...
  • 2 weber tobias
  • 2 bhatnagar shalab...
  • 2 szepesvari csaba
  • 2 klein anja
  • 2 chen zaiwei
  • 2 clarke john-paul
  • 2 sellathurai math...
  • 2 ortiz andrea
  • 1 fu qiming
  • 1 lu guosheng
  • 1 wookey dean s.
  • 1 chen jianping
  • 1 weisz gellert
  • 1 jordan michael i
  • 1 zhong shan
  • 1 tan jack

Language

  • 27 English
Search criteria: Subject = "Linear function approximation"
27 records, showing 1-10
Provably Efficient Reinforcement Learning with Linear Function Approximation
MATHEMATICS OF OPERATIONS RESEARCH, 2023, Vol. 48, No. 3, pp. 1496-1521
Authors: Jin, Chi; Yang, Zhuoran; Wang, Zhaoran; Jordan, Michael I. (Princeton Univ, Princeton, NJ 08544 USA; Yale Univ, New Haven, CT 06520 USA; Northwestern Univ, Evanston, IL 60208 USA; Univ Calif Berkeley, Berkeley, CA 94720 USA)
Modern reinforcement learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy. The...
Adaptive Temporal Difference Learning With Linear Function Approximation
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, Vol. 44, No. 12, pp. 8812-8824
Authors: Sun, Tao; Shen, Han; Chen, Tianyi; Li, Dongsheng (Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China; Rensselaer Polytech Inst, Dept ECSE, Troy, NY 12180 USA)
This paper revisits the temporal difference (TD) learning algorithm for the policy evaluation tasks in reinforcement learning. Typically, the performance of TD(0) and TD(lambda) is very sensitive to the choice of step...
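For context, the baseline that such adaptive schemes build on is plain TD(0) with a linear value estimate V(s) = phi(s) @ theta. A minimal sketch on a toy deterministic chain (the environment, feature choice, and all names here are illustrative assumptions, not this paper's adaptive step-size method):

```python
import numpy as np

def td0_linear(features, transitions, rewards, alpha=0.05, gamma=0.9,
               steps=5000, seed=0):
    """TD(0) policy evaluation with a linear value estimate V(s) = phi(s) @ theta."""
    rng = np.random.default_rng(seed)
    n_states, d = features.shape
    theta = np.zeros(d)
    s = 0
    for _ in range(steps):
        s_next = rng.choice(n_states, p=transitions[s])
        # TD error: r(s) + gamma * V(s') - V(s)
        delta = rewards[s] + gamma * features[s_next] @ theta - features[s] @ theta
        theta += alpha * delta * features[s]  # semi-gradient update
        s = s_next
    return theta

# Toy 3-state cycle with one-hot features (the tabular special case).
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
phi = np.eye(3)
r = np.array([0.0, 0.0, 1.0])
theta = td0_linear(phi, P, r)
```

With one-hot features, theta converges to the fixed point of V = r + gamma * P V, which is exactly the quantity whose step-size sensitivity the abstract discusses.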
Finite-Sample Analysis of Off-Policy Natural Actor-Critic With Linear Function Approximation
IEEE CONTROL SYSTEMS LETTERS, 2022, Vol. 6, pp. 2611-2616
Authors: Chen, Zaiwei; Khodadadian, Sajad; Maguluri, Siva Theja (Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA 30332 USA)
In this letter, we develop a novel variant of natural actor-critic algorithm using off-policy sampling and linear function approximation, and establish a sample complexity of O(epsilon^-3), outperforming all the prev...
LinFa-Q: Accurate Q-Learning with Linear Function Approximation
NEUROCOMPUTING, 2025, Vol. 611
Authors: Wang, Zhechao; Fu, Qiming; Chen, Jianping; Liu, Quan; Lu, You; Wu, Hongjie; Hu, Fuyuan (Suzhou Univ Sci & Technol, Sch Elect & Informat Engn, Suzhou 215009, Peoples R China; Suzhou Univ Sci & Technol, Jiangsu Prov Engn Res Ctr Construct Carbon Neutral, Suzhou 215009, Peoples R China; Suzhou Univ Sci & Technol, Jiangsu Prov Key Lab Intelligent Bldg Energy Effic, Suzhou 215009, Peoples R China; Suzhou Univ Sci & Technol, Sch Architecture & Urban Planning, Suzhou 215009, Peoples R China; Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Jiangsu, Peoples R China)
Although Q-learning has achieved remarkable success in some practical cases, it often suffers from the overestimation problem in stochastic environments, which is commonly viewed as a shortcoming of Q-learning. Overes...
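The plain linear Q-learning baseline that such work aims to improve can be sketched as below; the max over bootstrapped next-action values in the target is what drives the overestimation bias the abstract mentions. Everything here (environment, one-hot features, hyperparameters) is an illustrative assumption, not the paper's algorithm:

```python
import numpy as np

def q_learning_linear(n_states, n_actions, step, alpha=0.1, gamma=0.95,
                      episodes=500, eps=0.1, max_steps=100, seed=0):
    """Q-learning with a linear estimate Q(s, a) = phi(s, a) @ theta."""
    rng = np.random.default_rng(seed)
    d = n_states * n_actions
    theta = np.zeros(d)

    def phi(s, a):
        f = np.zeros(d)
        f[s * n_actions + a] = 1.0  # one-hot (s, a) feature: tabular special case
        return f

    def q(s, a):
        return phi(s, a) @ theta

    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if rng.random() < eps:  # epsilon-greedy exploration
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax([q(s, b) for b in range(n_actions)]))
            s_next, r, done = step(s, a, rng)
            # max-based bootstrap target: the source of the overestimation bias
            target = r if done else r + gamma * max(q(s_next, b) for b in range(n_actions))
            theta += alpha * (target - q(s, a)) * phi(s, a)
            if done:
                break
            s = s_next
    return theta

def chain_step(s, a, rng):
    """Toy 3-state chain: action 1 moves right; reaching the right end pays 1."""
    if a == 1:
        if s == 2:
            return s, 1.0, True
        return s + 1, 0.0, False
    return s, 0.0, False  # action 0 stays in place

theta = q_learning_linear(3, 2, chain_step)
```

On this deterministic toy problem the max-target is harmless; overestimation appears once rewards or transitions are noisy, because the max of noisy estimates is biased upward.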
Efficient Local Planning with Linear Function Approximation
33rd International Conference on Algorithmic Learning Theory (ALT)
Authors: Yin, Dong; Hao, Botao; Abbasi-Yadkori, Yasin; Lazic, Nevena; Szepesvari, Csaba (DeepMind, London, England; Univ Alberta, Edmonton, AB, Canada)
We study query and computationally efficient planning algorithms for discounted Markov decision processes (MDPs) with linear function approximation and a simulator. The agent is assumed to have local access to the sim...
Reinforcement Learning vs. Rule-Based Adaptive Traffic Signal Control: A Fourier Basis Linear Function Approximation for Traffic Signal Control
AI COMMUNICATIONS, 2021, Vol. 34, No. 1, pp. 89-103
Authors: Ziemke, Theresa; Alegre, Lucas N.; Bazzan, Ana L. C. (Tech Universität Berlin, Transport Syst Planning & Transport Telemat, Berlin, Germany; Univ Fed Rio Grande do Sul UFRGS, Inst Informat, Porto Alegre, RS, Brazil)
Reinforcement learning is an efficient, widely used machine learning technique that performs well when the state and action spaces have a reasonable size. This is rarely the case regarding control-related problems, as...
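The Fourier basis named in the title is a standard feature construction for linear function approximation: with the state scaled into [0, 1]^d, each feature is cos(pi * c . x) for an integer coefficient vector c. A generic sketch (the two-dimensional traffic state below is our own toy example, not the paper's encoding):

```python
import itertools
import numpy as np

def fourier_features(state, order):
    """Order-n Fourier basis: one feature phi_c(x) = cos(pi * c . x)
    per coefficient vector c in {0, ..., order}^d."""
    x = np.asarray(state, dtype=float)  # assumed already scaled into [0, 1]^d
    coeffs = np.array(list(itertools.product(range(order + 1), repeat=len(x))))
    return np.cos(np.pi * coeffs @ x)

# e.g. a normalized (queue length, elapsed phase time) state with an
# order-2 basis yields (2 + 1)^2 = 9 features; V(s) is then phi(s) @ theta.
phi = fourier_features([0.5, 0.25], order=2)
```

The basis size grows as (order + 1)^d, so in practice the coefficient set is often pruned for higher-dimensional states.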
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method
MACHINE LEARNING, 2018, Vol. 107, No. 8-10, pp. 1385-1429
Authors: Joseph, Ajin George; Bhatnagar, Shalabh (Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada; Indian Inst Sci, Dept Comp Sci & Automat, Bangalore, Karnataka, India; Indian Inst Sci, Robert Bosch Ctr Cyber Phys Syst, Bangalore, Karnataka, India)
In this paper, we provide two new stable online algorithms for the problem of prediction in reinforcement learning, i.e., estimating the value function of a model-free Markov reward process using the linear function a...
Multi-Agent Federated Q-Learning Algorithms for Wireless Edge Caching
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2025, Vol. 74, No. 2, pp. 2973-2988
Authors: Liu, Zhikai; Garg, Navneet; Ratnarajah, Tharmalingam (Univ Edinburgh, Inst Imaging Data & Commun, Sch Engn, Edinburgh EH8 9YL, Scotland; Univ Edinburgh, Inst Data Image & Commun, Sch Engn, Edinburgh EH8 9YL, Scotland; LNM Inst Informat Technol, Dept Elect & Commun Engn, Jaipur 302004, India)
Edge caching is an increasingly vital technique in wireless networks, particularly needed to address users' repeated demands, including real-time traffic data and map access in vehicular communications. This paper...
Target Network and Truncation Overcome the Deadly Triad in Q-Learning
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2023, Vol. 5, No. 4, pp. 1078-1101
Authors: Chen, Zaiwei; Clarke, John-Paul; Maguluri, Siva Theja (CALTECH, Comp Math Sci, Pasadena, CA 91106 USA; Univ Texas Austin, Aerosp Engn & Engn Mech, Austin, TX 78712 USA; Georgia Inst Technol, Ind Syst Engn, Atlanta, GA 30332 USA)
Q-learning with function approximation is one of the most empirically successful while theoretically mysterious reinforcement learning (RL) algorithms and was identified in [R. S. Sutton, in European Conference on Com...
Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
AUTOMATICA, 2022, Vol. 146
Authors: Chen, Zaiwei; Zhang, Sheng; Doan, Thinh T.; Clarke, John-Paul; Maguluri, Siva Theja (Georgia Inst Technol, Atlanta, GA 30332 USA; Virginia Tech, Blacksburg, VA 24061 USA; Univ Texas Austin, Austin, TX 78712 USA)
Motivated by applications in reinforcement learning (RL), we study a nonlinear stochastic approximation (SA) algorithm under Markovian noise, and establish its finite-sample convergence bounds under various stepsizes....