Search query: "Subject term = Actor-Critic algorithms"
41 records; showing 31-40
Has Dynamic Programming Improved Decision Making?
Annual Review of Economics, 2019, Vol. 11, No. 1, pp. 833-858
Author: Rust, John (Georgetown Univ Dept Econ Washington DC 20057 USA)
Dynamic programming (DP) is a powerful tool for solving a wide class of sequential decision-making problems under uncertainty. In principle, it enables us to compute optimal decision rules that specify the best possib...
Risk Averse Reinforcement Learning for Mixed Multi-Agent Environments
18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS)
Authors: Reddy, D. Sai Koti; Saha, Amrita; Tamilselvam, Srikanth G.; Agrawal, Priyanka; Dayama, Pankaj (IBM Res Yorktown Hts NY 10598 USA)
Most real-world applications of multi-agent systems need to keep a balance between maximizing the rewards and minimizing the risks. In this work we consider a popular risk measure, variance of return (VOR), as a cons...
Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality
Applied Mathematical Finance, 2019, Vol. 26, No. 5, pp. 387-452
Authors: Guéant, Olivier; Manziuk, Iuliia (Centre d’Economie de la Sorbonne, Université Paris 1 Panthéon-Sorbonne, Paris, France)
In corporate bond markets, which are mainly OTC markets, market makers play a central role by providing bid and ask prices for bonds to asset managers. Determining the optimal bid and ask quotes that a market maker sh...
SMART BUILDING REAL TIME PRICING FOR OFFERING LOAD-SIDE REGULATION SERVICE RESERVES
52nd IEEE Annual Conference on Decision and Control (CDC)
Authors: Bilgin, Enes; Caramanis, Michael C.; Paschalidis, Ioannis Ch. (Boston Univ Coll Engn Ctr Informat & Syst Engn Boston MA 02215 USA)
Provision of Regulation Service (RS) reserves to Power Markets by smart building demand response has attracted attention in recent literature. This paper develops tractable dynamic optimal pricing algorithms for distr...
REINFORCEMENT LEARNING CONTROL FOR AUTOROTATION OF A SIMPLE POINT-MASS HELICOPTER MODEL
Author: KADIRCAN KOPSA (MIDDLE EAST TECHNICAL UNIVERSITY)
Degree level: Master's
This study presents an application of an actor-critic reinforcement learning method to a simple point-mass model helicopter guidance problem during autorotation. A point-mass model of an OH-58A helicopter in autorotat...
Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control
APPLIED ENERGY, 2021, Vol. 298, p. 117164
Authors: Biemann, Marco; Scheller, Fabian; Liu, Xiufeng; Huang, Lizhen (Tech Univ Denmark Dept Technol Management & Econ DK-2800 Lyngby Denmark; Norwegian Univ Sci & Technol Dept Mfg & Civil Engn N-2815 Gjovik Norway)
Controlling heating, ventilation and air-conditioning (HVAC) systems is crucial to improving demand-side energy efficiency. At the same time, the thermodynamics of buildings and uncertainties regarding human activitie...
A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes
Proceedings of the 45th IEEE Conference on Decision and Control, Volume 1 of 14
Authors: Shalabh Bhatnagar; Mohammed Shahid Abdulla (The Institute of Electrical and Electronics Engineers, Inc.; Department of Computer Science and Automation, Indian Institute of Science, Bangalore, INDIA)
We develop a simulation-based algorithm for finite horizon Markov decision processes with finite state and finite action space. Illustrative numerical experiments with the proposed algorithm are shown for problems in ...
Risk-sensitive reinforcement learning: a martingale approach to reward uncertainty
Proceedings of the First ACM International Conference on AI in Finance
Authors: Nelson Vadori; Sumitra Ganesh; Prashant Reddy; Manuela Veloso (J.P. Morgan AI Research)
We introduce a novel framework to account for sensitivity to rewards uncertainty in sequential decision-making problems. While risk-sensitive formulations for Markov decision processes studied so far focus on the dist...
Risk Averse Reinforcement Learning for Mixed Multi-agent Environments
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems
Authors: D. Sai Koti Reddy; Amrita Saha; Srikanth G. Tamilselvam; Priyanka Agrawal; Pankaj Dayama (IBM Research, Bangalore, India)
Most real-world applications of multi-agent systems need to keep a balance between maximizing the rewards and minimizing the risks. In this work we consider a popular risk measure, variance of return (VOR), as a cons...
Multi-Objective Reinforcement Learning with Non-Linear Scalarization
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems
Authors: Mridul Agarwal; Vaneet Aggarwal; Tian Lan (Purdue University, West Lafayette, IN, USA; George Washington University, Washington, DC, USA)
Multi-Objective Reinforcement Learning (MORL) setup naturally arises in many places where an agent optimizes multiple objectives. We consider the problem of MORL where multiple objectives are combined using a non-line...