Refine search results

Document type

  • 53 journal articles
  • 24 conference papers
  • 1 thesis

Holdings

  • 78 electronic documents
  • 0 print holdings

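Subject classification

  • 73 papers: Engineering
    • 41 papers: Computer Science and Technology...
    • 31 papers: Electrical Engineering
    • 22 papers: Control Science and Engineering
    • 18 papers: Information and Communication Engineering
    • 7 papers: Transportation Engineering
    • 5 papers: Software Engineering
    • 3 papers: Mechanical Engineering
    • 3 papers: Instrument Science and Technology
    • 2 papers: Surveying and Mapping Science and Technology
    • 1 paper: Civil Engineering
    • 1 paper: Chemical Engineering and Technology
  • 18 papers: Management
    • 14 papers: Management Science and Engineering (degree may be...
    • 2 papers: Business Administration
    • 2 papers: Library, Information and Archives Management...
  • 12 papers: Science
    • 10 papers: Mathematics
    • 6 papers: Systems Science
    • 1 paper: Geophysics
    • 1 paper: Statistics (degree may be conferred in Science, ...
  • 3 papers: Medicine
    • 3 papers: Clinical Medicine
    • 1 paper: Basic Medicine (degree may be conferred in Medicine...
  • 1 paper: Economics
    • 1 paper: Theoretical Economics
    • 1 paper: Applied Economics
  • 1 paper: Education
    • 1 paper: Education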

Topics

  • 78 papers: actor-critic alg...
  • 31 papers: reinforcement le...
  • 18 papers: deep reinforceme...
  • 5 papers: deep learning
  • 5 papers: reinforcement le...
  • 4 papers: input constraint...
  • 3 papers: dynamic path pla...
  • 3 papers: task analysis
  • 3 papers: transfer learnin...
  • 3 papers: differential gam...
  • 3 papers: trajectory
  • 3 papers: active hypothesi...
  • 3 papers: sequential sensi...
  • 3 papers: multi-agent rein...
  • 2 papers: vehicle dynamics
  • 2 papers: quickest state e...
  • 2 papers: sample efficienc...
  • 2 papers: nonzero-sum stoc...
  • 2 papers: transformer
  • 2 papers: industry 4.0

Institutions

  • 4 papers: syracuse univ de...
  • 4 papers: indian inst sci ...
  • 3 papers: menoufia univ fa...
  • 3 papers: school of contro...
  • 2 papers: lebanese amer un...
  • 2 papers: concordia univ m...
  • 2 papers: harokopio univ a...
  • 2 papers: huazhong univ sc...
  • 2 papers: lakehead univ th...
  • 2 papers: nile univ sesc r...
  • 2 papers: jilin univ key l...
  • 2 papers: univ elect sci &...
  • 2 papers: shenzhen univ co...
  • 2 papers: texas a&m univ d...
  • 2 papers: jilin univ coll ...
  • 1 paper: univ calif berke...
  • 1 paper: zhongguancun lab...
  • 1 paper: univ texas austi...
  • 1 paper: hanoi univ sci &...
  • 1 paper: texas a&m univ c...

Authors

  • 3 papers: joseph geethu
  • 3 papers: chronis christos
  • 3 papers: shalaby raafat
  • 3 papers: bhatnagar shalab...
  • 3 papers: varlamis iraklis
  • 3 papers: varshney pramod ...
  • 3 papers: politi elena
  • 3 papers: gursoy m. cenk
  • 3 papers: mahmoud tarek a.
  • 3 papers: abo-zalam belal
  • 3 papers: dimitrakopoulos ...
  • 3 papers: el-hossainy moha...
  • 2 papers: wang bing-chang
  • 2 papers: assi chadi
  • 2 papers: wang yanzhi
  • 2 papers: zhang zhicai
  • 2 papers: lu shuai
  • 2 papers: qiu qinru
  • 2 papers: qu hong
  • 2 papers: parizs richard d...

Language

  • 78 papers: English

Search query: "Subject term = Actor-Critic algorithm"
78 records; showing 21-30

Graph attention, learning 2-opt algorithm for the traveling salesman problem
COMPLEX & INTELLIGENT SYSTEMS, 2025, vol. 11, no. 1, pp. 1-21
Authors: Luo, Jia; Heng, Herui; Wu, Geng
Affiliations: Ningbo Univ Technol Sch Econ & Management Ningbo 315211 Peoples R China; Shanghai Maritime Univ Inst Logist Sci & Engn Shanghai 201306 Peoples R China
In recent years, deep graph neural networks (GNNs) have been used as solvers or helper functions for the traveling salesman problem (TSP), but they are usually used as encoders to generate static node representations ...

Optimizing Non-Terrestrial Hybrid RF/FSO Links With Reinforcement Learning: Navigating Through Clouds
IEEE OPEN JOURNAL OF THE COMMUNICATIONS SOCIETY, 2025, vol. 6, pp. 793-806
Authors: Almohamad, Abdullateef; Ibrahim, Mostafa; Ekin, Sabit; Hasna, Mazen; Althunibat, Saud; Qaraqe, Khalid
Affiliations: Texas A&M Univ Coll Stn Dept Elect & Comp Engn College Stn TX 77843 USA; Texas A&M Univ Coll Stn Dept Engn Technol & Ind Distribut College Stn TX 77843 USA; Qatar Univ Elect Engn Dept Doha Qatar; Al Hussein Bin Talal Univ Dept Commun Engn Maan Jordan; Hamad Bin Khalifa Univ Coll Sci & Engn Doha Qatar
In the pursuit of ubiquitous broadband connectivity, there has been a significant shift towards the vertical expansion of communication networks into space, particularly through the exploitation of low Earth orbit (LE...

Formation control scheme with reinforcement learning strategy for a group of multiple surface vehicles
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, vol. 34, no. 3, pp. 2252-2279
Authors: Nguyen, Khai; Dang, Van Trong; Pham, Dinh Duong; Dao, Phuong Nam
Affiliations: Carnegie Mellon Univ Dept Mech Engn Pittsburgh PA USA; Hanoi Univ Sci & Technol Sch Elect & Elect Engn Hanoi Vietnam
This article presents a comprehensive approach to integrate formation tracking control and optimal control for a fleet of multiple surface vehicles (SVs), accounting for both kinematic and dynamic models of each SV ag...

Uncertainty modified policy for multi-agent reinforcement learning
APPLIED INTELLIGENCE, 2024, vol. 54, no. 22, pp. 12020-12034
Authors: Zhao, Xinyu; Liu, Jianxiang; Wu, Faguo; Zhang, Xiao; Wang, Guojian
Affiliations: Beihang Univ Sch Math Sci 37 Xueyuan Rd Beijing 100190 Peoples R China; Beihang Univ Inst Artificial Intelligence 37 Xueyuan Rd Beijing 100190 Peoples R China; Beihang Univ Key Lab Math Informat & Behav Semant LMIB Beijing 100191 Peoples R China; Zhongguancun Lab Beijing 100194 Peoples R China; Beihang Univ Beijing Adv Innovat Ctr Future Blockchain & Privacy Beijing 100191 Peoples R China
Uncertainty in the evolution of opponent behavior creates a non-stationary environment for the agent, reducing the reliability of value estimation and strategy selection while compromising security during the explorat...

Fractional-order fuzzy sliding mode control of uncertain nonlinear MIMO systems using fractional-order reinforcement learning
COMPLEX & INTELLIGENT SYSTEMS, 2024, vol. 10, no. 2, pp. 3057-3085
Authors: Mahmoud, Tarek A.; El-Hossainy, Mohammad; Abo-Zalam, Belal; Shalaby, Raafat
Affiliations: Menoufia Univ Fac Elect Engn Dept Ind Elect & Control Engn Menoufia 32952 Egypt; Nile Univ SESC Res Ctr Sch Engn & Appl Sci MECT Program Giza 12588 Egypt
This paper introduces a novel approach aimed at enhancing the control performance of a specific class of unknown multiple-input and multiple-output nonlinear systems. The proposed method involves the utilization of a ...

Reinforcement learning with dynamic convex risk measures
MATHEMATICAL FINANCE, 2024, vol. 34, no. 2, pp. 557-587
Authors: Coache, Anthony; Jaimungal, Sebastian
Affiliations: Univ Toronto Dept Stat Sci Toronto ON Canada; Univ Oxford Oxford Man Inst Oxford England
We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems using model-free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random v...

Dynamic Navigation in Unconstrained Environments Using Reinforcement Learning Algorithms
IEEE ACCESS, 2023, vol. 11, pp. 117984-118001
Authors: Chronis, Christos; Anagnostopoulos, Georgios; Politi, Elena; Dimitrakopoulos, George; Varlamis, Iraklis
Affiliations: Harokopio Univ Athens Dept Informat & Telemat Athens 17779 Greece
The potential for the use of drones in logistics and transportation is continuously growing, with multiple applications both in urban and rural environments. The safe navigation of drones in such environments is a maj...

A Maximum Divergence Approach to Optimal Policy in Deep Reinforcement Learning
IEEE TRANSACTIONS ON CYBERNETICS, 2023, vol. 53, no. 3, pp. 1499-1510
Authors: Yang, Zhiyou; Qu, Hong; Fu, Mingsheng; Hu, Wang; Zhao, Yongze
Affiliations: Univ Elect Sci & Technol China Sch Comp Sci & Engn Chengdu 610054 Peoples R China
Model-free reinforcement learning algorithms based on entropy regularized have achieved good performance in control tasks. Those algorithms consider using the entropy-regularized term for the policy to learn a stochas...

Simultaneous locomotion and manipulation control of quadruped robots using reinforcement learning-based adaptive fractional-order sliding-mode control
TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2023, vol. 45, no. 13, pp. 2459-2476
Authors: Farid, Yousef
Affiliations: Tarbiat Modares Univ Sch Elect & Comp Engn POB 14115-111 Tehran Iran
This paper investigates a model-free reinforcement learning-based approach that enables the quadruped robot to manipulate objects while maintaining its balance and dynamic stability during walking. At first, the dynam...

Scalable and Decentralized Algorithms for Anomaly Detection via Learning-Based Controlled Sensing
IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2023, vol. 9, pp. 640-654
Authors: Joseph, Geethu; Zhong, Chen; Gursoy, M. Cenk; Velipasalar, Senem; Varshney, Pramod K.
Affiliations: Delft Univ Technol Signal Proc Syst Grp NL-2628 Delft Netherlands; Syracuse Univ Dept Elect Engn & Comp Sci Syracuse NY 13244 USA
We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes a subset of the processes at any given time instant and obtains a...