Search query: subject term = "policy gradient algorithms"
7 records found

Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games
Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems
Authors: Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, S. Shankar Sastry
Affiliations: University of California, Berkeley, Berkeley, CA, USA; University of Washington, Seattle, WA, USA
We show by counterexample that policy-gradient algorithms have no guarantees of even local convergence to Nash equilibria in continuous action and state space multi-agent settings. To do so, we analyze gradient-play i...

Satellite fault tolerant attitude control based on expert guided exploration of reinforcement learning agent
Journal of Experimental & Theoretical Artificial Intelligence, 2024
Authors: Henna, Hicham; Toubakh, Houari; Kafi, Mohamed Redouane; Gursoy, Oemer; Sayed-Mouchaweh, Moamar; Djemai, Mohamed
Affiliations: Kasdi Merbah Univ, LAGE Lab, Ouargla, Algeria; Yildiz Tech Univ, Istanbul, Turkiye; IMT Nord Europe, CERI SN, Douai, France; INSA Hauts Defrance, UPHF, INSA Hauts de France, Valenciennes, France; Quartz Lab, ENSEA, Cergy, France
This research provides a method that accelerates learning and avoids local minima to improve the policy gradient algorithm's learning process. Reinforcement learning has the advantage of not requiring a model. Con...

Proximal policy optimization based hybrid recommender systems for large scale recommendations
Multimedia Tools and Applications, 2023, vol. 82, no. 13, pp. 20079-20100
Authors: Padhye, Vaibhav; Lakshmanan, Kailasam; Chaturvedi, Amrita
Affiliations: IIT BHU Varanasi, BHU, Varanasi, India
Recommender systems have become increasingly popular due to the significant rise in digital information over the internet in recent years. They help provide personalized recommendations to the user by selecting a few ...

Safe Reinforcement Learning with Chance-constrained Model Predictive Control
4th Annual Conference on Learning for Dynamics and Control (L4DC)
Authors: Pfrommer, Samuel; Gautam, Tanmay; Zhou, Alec; Sojoudi, Somayeh
Affiliations: Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720, USA
Real-world reinforcement learning (RL) problems often demand that agents behave safely by obeying a set of designed constraints. We address the challenge of safe RL by coupling a safety guide based on model predictive...

Risk-sensitive reinforcement learning: a martingale approach to reward uncertainty
Proceedings of the First ACM International Conference on AI in Finance
Authors: Nelson Vadori, Sumitra Ganesh, Prashant Reddy, Manuela Veloso
Affiliations: J.P. Morgan AI Research
We introduce a novel framework to account for sensitivity to rewards uncertainty in sequential decision-making problems. While risk-sensitive formulations for Markov decision processes studied so far focus on the dist...

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Journal of Machine Learning Research, 2018, vol. 18, pp. 1-51
Authors: Chow, Yinlam; Ghavamzadeh, Mohammad; Janson, Lucas; Pavone, Marco
Affiliations: DeepMind, Mountain View, CA 94043, USA; Stanford Univ, Dept Stat, Stanford, CA 94305, USA; Stanford Univ, Aeronaut & Astronaut, Stanford, CA 94305, USA
In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. A...

Risk-constrained reinforcement learning with percentile risk criteria
The Journal of Machine Learning Research, 2017, vol. 18, no. 1
Authors: Yinlam Chow, Mohammad Ghavamzadeh, Lucas Janson, Marco Pavone
Affiliations: DeepMind, Mountain View, CA; Department of Statistics, Stanford University, Stanford, CA; Aeronautics and Astronautics, Stanford University, Stanford, CA
In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. A...