Refine search results

Document type

  • 52 journal articles
  • 24 conference papers
  • 1 thesis

Collection scope

  • 77 electronic documents
  • 0 print holdings

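Subject classification

  • 72 papers: Engineering
    • 41 papers: Computer Science and Technology...
    • 30 papers: Electrical Engineering
    • 22 papers: Control Science and Engineering
    • 18 papers: Information and Communication Engineering
    • 6 papers: Transportation Engineering
    • 5 papers: Software Engineering
    • 3 papers: Mechanical Engineering
    • 3 papers: Instrument Science and Technology
    • 2 papers: Surveying and Mapping Science and Technology
    • 1 paper: Chemical Engineering and Technology
  • 18 papers: Management
    • 14 papers: Management Science and Engineering (...
    • 2 papers: Business Administration
    • 2 papers: Library, Information and Archives Manag...
  • 12 papers: Science
    • 10 papers: Mathematics
    • 6 papers: Systems Science
    • 1 paper: Geophysics
    • 1 paper: Statistics (degree may be awarded in Science, ...
  • 3 papers: Medicine
    • 3 papers: Clinical Medicine
    • 1 paper: Basic Medicine (degree may be awarded in Medicine...
  • 1 paper: Economics
    • 1 paper: Theoretical Economics
    • 1 paper: Applied Economics
  • 1 paper: Education
    • 1 paper: Education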

Topics

  • 77 papers: actor-critic alg...
  • 31 papers: reinforcement le...
  • 17 papers: deep reinforceme...
  • 5 papers: deep learning
  • 5 papers: reinforcement le...
  • 4 papers: input constraint...
  • 3 papers: dynamic path pla...
  • 3 papers: task analysis
  • 3 papers: transfer learnin...
  • 3 papers: differential gam...
  • 3 papers: reinforcement le...
  • 3 papers: trajectory
  • 3 papers: active hypothesi...
  • 3 papers: sequential sensi...
  • 3 papers: multi-agent rein...
  • 2 papers: vehicle dynamics
  • 2 papers: quickest state e...
  • 2 papers: sample efficienc...
  • 2 papers: nonzero-sum stoc...
  • 2 papers: industry 4.0

Institutions

  • 4 papers: syracuse univ de...
  • 4 papers: indian inst sci ...
  • 3 papers: menoufia univ fa...
  • 3 papers: school of contro...
  • 2 papers: lebanese amer un...
  • 2 papers: concordia univ m...
  • 2 papers: harokopio univ a...
  • 2 papers: huazhong univ sc...
  • 2 papers: lakehead univ th...
  • 2 papers: nile univ sesc r...
  • 2 papers: jilin univ key l...
  • 2 papers: univ elect sci &...
  • 2 papers: shenzhen univ co...
  • 2 papers: texas a&m univ d...
  • 2 papers: jilin univ coll ...
  • 1 paper: univ calif berke...
  • 1 paper: zhongguancun lab...
  • 1 paper: univ texas austi...
  • 1 paper: hanoi univ sci &...
  • 1 paper: texas a&m univ c...

Authors

  • 3 papers: joseph geethu
  • 3 papers: chronis christos
  • 3 papers: shalaby raafat
  • 3 papers: bhatnagar shalab...
  • 3 papers: varlamis iraklis
  • 3 papers: varshney pramod ...
  • 3 papers: politi elena
  • 3 papers: gursoy m. cenk
  • 3 papers: mahmoud tarek a.
  • 3 papers: abo-zalam belal
  • 3 papers: dimitrakopoulos ...
  • 3 papers: el-hossainy moha...
  • 2 papers: wang bing-chang
  • 2 papers: assi chadi
  • 2 papers: wang yanzhi
  • 2 papers: zhang zhicai
  • 2 papers: lu shuai
  • 2 papers: qiu qinru
  • 2 papers: qu hong
  • 2 papers: parizs richard d...

Language

  • 77 papers: English

Search criteria: "Subject term = Actor-critic algorithm"
77 records; showing 41-50

Age of Information Aware Trajectory Planning of UAVs in Intelligent Transportation Systems: A Deep Learning Approach
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2020, vol. 69, no. 11, pp. 12382-12395
Authors: Samir, Moataz; Assi, Chadi; Sharafeddine, Sanaa; Ebrahimi, Dariush; Ghrayeb, Ali. Affiliations: Concordia Univ Montreal PQ H3G 1M8 Canada; Lebanese Amer Univ Beirut 11022801 Lebanon; Lakehead Univ Thunder Bay ON P7B 5E1 Canada; Texas A&M Univ Doha 23874 Qatar
Unmanned aerial vehicles (UAVs) are envisioned to play a key role in intelligent transportation systems to complement the communication infrastructure in future smart cities. UAV-assisted vehicular networking research...

Actor-critic Based Graphical Games for Discrete-time Linear Systems with Input Constraints
39th Chinese Control Conference
Authors: Tian-Xiang Wang; Yong Liang; Bing-Chang Wang. Affiliation: School of Control Science and Engineering, Shandong University
In dynamic graphical games, in order to obtain the optimal strategy for each agent, the traditional method is to solve a set of coupled HJB equations. It is very difficult to solve such problems by traditional methods...

Manipulator Motion Planning based on Actor-Critic Reinforcement Learning
40th Chinese Control Conference
Authors: Qiang Li; Jun Nie; Haixia Wang; Xiao Lu; Shibin Song. Affiliation: College of Electrical Engineering and Automation, Shandong University of Science and Technology
The manipulator control model has the characteristics of high-order, nonlinear, multivariable and strong coupling, which makes it difficult for the manipulator to have good adaptability and *** at the problem of poor reu...

Leveraging UAVs for Coverage in Cell-Free Vehicular Networks: A Deep Reinforcement Learning Approach
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2021, vol. 20, no. 9, pp. 2835-2847
Authors: Samir, Moataz; Ebrahimi, Dariush; Assi, Chadi; Sharafeddine, Sanaa; Ghrayeb, Ali. Affiliations: Concordia Univ Montreal PQ H3G 1M8 Canada; Lakehead Univ Thunder Bay ON P7B 5E1 Canada; Lebanese Amer Univ Beirut 11022801 Lebanon; Texas A&M Univ Doha 23874 Qatar
The success in transitioning towards smart cities relies on the availability of information and communication technologies that meet the demands of this transformation. The terrestrial infrastructure presents itself a...

Formation control scheme with reinforcement learning strategy for a group of multiple surface vehicles
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, vol. 34, no. 3, pp. 2252-2279
Authors: Nguyen, Khai; Dang, Van Trong; Pham, Dinh Duong; Dao, Phuong Nam. Affiliations: Carnegie Mellon Univ Dept Mech Engn Pittsburgh PA USA; Hanoi Univ Sci & Technol Sch Elect & Elect Engn Hanoi Vietnam
This article presents a comprehensive approach to integrate formation tracking control and optimal control for a fleet of multiple surface vehicles (SVs), accounting for both kinematic and dynamic models of each SV ag...

Uncertainty modified policy for multi-agent reinforcement learning
APPLIED INTELLIGENCE, 2024, vol. 54, no. 22, pp. 12020-12034
Authors: Zhao, Xinyu; Liu, Jianxiang; Wu, Faguo; Zhang, Xiao; Wang, Guojian. Affiliations: Beihang Univ Sch Math Sci 37 Xueyuan Rd Beijing 100190 Peoples R China; Beihang Univ Inst Artificial Intelligence 37 Xueyuan Rd Beijing 100190 Peoples R China; Beihang Univ Key Lab Math Informat & Behav Semant LMIB Beijing 100191 Peoples R China; Zhongguancun Lab Beijing 100194 Peoples R China; Beihang Univ Beijing Adv Innovat Ctr Future Blockchain & Privacy Beijing 100191 Peoples R China
Uncertainty in the evolution of opponent behavior creates a non-stationary environment for the agent, reducing the reliability of value estimation and strategy selection while compromising security during the explorat...

Fractional-order fuzzy sliding mode control of uncertain nonlinear MIMO systems using fractional-order reinforcement learning
COMPLEX & INTELLIGENT SYSTEMS, 2024, vol. 10, no. 2, pp. 3057-3085
Authors: Mahmoud, Tarek A.; El-Hossainy, Mohammad; Abo-Zalam, Belal; Shalaby, Raafat. Affiliations: Menoufia Univ Fac Elect Engn Dept Ind Elect & Control Engn Menoufia 32952 Egypt; Nile Univ SESC Res Ctr Sch Engn & Appl Sci MECT Program Giza 12588 Egypt
This paper introduces a novel approach aimed at enhancing the control performance of a specific class of unknown multiple-input and multiple-output nonlinear systems. The proposed method involves the utilization of a ...

Reinforcement learning with dynamic convex risk measures
MATHEMATICAL FINANCE, 2024, vol. 34, no. 2, pp. 557-587
Authors: Coache, Anthony; Jaimungal, Sebastian. Affiliations: Univ Toronto Dept Stat Sci Toronto ON Canada; Univ Oxford Oxford Man Inst Oxford England
We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems using model-free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random v...

Scalable and Decentralized Algorithms for Anomaly Detection via Learning-Based Controlled Sensing
IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2023, vol. 9, pp. 640-654
Authors: Joseph, Geethu; Zhong, Chen; Gursoy, M. Cenk; Velipasalar, Senem; Varshney, Pramod K. Affiliations: Delft Univ Technol Signal Proc Syst Grp NL-2628 Delft Netherlands; Syracuse Univ Dept Elect Engn & Comp Sci Syracuse NY 13244 USA
We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes a subset of the processes at any given time instant and obtains a...

Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning
SIAM JOURNAL ON FINANCIAL MATHEMATICS, 2023, vol. 14, no. 4, pp. 1249-1289
Authors: Coache, Anthony; Jaimungal, Sebastian; Cartea, Alvaro. Affiliations: Univ Toronto Dept Stat Sci Toronto ON M5G 1Z5 Canada; Oxford Man Inst Quantitat Finance Oxford OX2 6ED England; Univ Oxford Oxford Man Inst Quantitat Finance Oxford OX2 9HB England; Univ Oxford Math Inst Oxford OX2 9HB England
We propose a novel framework to solve risk-sensitive reinforcement learning problems where the agent optimizes time-consistent dynamic spectral risk measures. Based on the notion of conditional elicitability, our meth...