Search Results - Inner Mongolia University Library

CTS: Concurrent Teacher-Student Reinforcement Learning for Legged Locomotion

IEEE Robotics and Automation Letters, 2024, vol. 9, no. 11, pp. 9191-9198

Authors: Wang, Hongxi; Luo, Haoxiang; Zhang, Wei; Chen, Hua. Affiliations: Southern Univ Sci & Technol, Sch Syst Design & Intelligent Mfg SDIM, Shenzhen 518055, Peoples R China; LimX Dynam, Shenzhen 314400, Peoples R China; Zhejiang Univ, Univ Illinois Urbana Champaign Inst, Haining, Peoples R China

Thanks to recent explosive developments in data-driven learning methodologies, reinforcement learning (RL) has emerged as a promising solution to the legged locomotion problem in robotics. In this letter, we propose CTS, a novel Concurrent Teacher-Student reinforcement learning architecture for legged locomotion over uneven terrains. Unlike the conventional teacher-student architecture, which first trains the teacher policy via RL and then transfers the knowledge to the student policy through supervised learning, our proposed architecture trains the teacher and student policy networks concurrently under the reinforcement learning paradigm. To this end, we develop a new training scheme based on a modified proximal policy optimization (PPO) method that exploits data samples collected from the interactions of both the teacher and the student policies with the environment. The effectiveness of the proposed architecture and the new training scheme is demonstrated through substantial quantitative simulation comparisons with state-of-the-art approaches and extensive indoor and outdoor experiments with quadrupedal and point-foot bipedal robot platforms, showcasing robust and agile locomotion capability. Quantitative simulation comparisons show that our approach reduces the average velocity tracking error by up to 20% compared to the two-stage teacher-student approach, demonstrating significant superiority in addressing blind locomotion tasks.

Keywords: Legged locomotion; Robots; Proprioception; Training; Reinforcement learning; Quadrupedal robots; Trajectory; Legged robots; machine learning for robot control; reinforcement learning

Model-Free Adaptive Control for Discrete-Time Nonlinear Systems with Partially Known Structures

IEEE 18th International Conference on Control and Automation (ICCA)

Authors: Li, Fanghui; Hou, Zhongsheng. Affiliation: Qingdao Univ, Inst Syst Sci & Control, Qingdao 266071, Peoples R China

ISBN: (print) 9798350354416; 9798350354409

This paper investigates the regulation problem for a class of discrete-time nonlinear non-affine systems with partially known structures using a model-free adaptive control (MFAC) algorithm. The core idea is to first linearize the known parts of the system's mathematical model with traditional linearization methods, and then apply dynamic linearization to the unknown structure of the controlled system and to the unmodeled dynamics generated by the traditional linearization, so that data-driven control (DDC) methods and model-based control (MBC) strategies complement each other and work collaboratively. Unlike the prototype MFAC algorithm, the control scheme devised in this paper fully utilizes the known structure of the system, so that the control objective can be better realized. Finally, the monotonic convergence of the system tracking error is rigorously proved, and the superiority of the developed algorithm is demonstrated by simulation comparisons.

Keywords: data-driven control (DDC); model-free adaptive control (MFAC); nonlinear non-affine systems
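
For orientation, the prototype MFAC algorithm that the abstract above contrasts against is commonly presented in a compact-form dynamic linearization (CFDL) version: a scalar pseudo partial derivative (PPD) is estimated from input/output increments and then used in an incremental control law. The Python sketch below illustrates that textbook prototype only, not the hybrid scheme proposed in the paper. The toy plant, the function names, and all parameter values (eta, mu, rho, lam, eps) are assumptions chosen for illustration.

```python
def cfdl_mfac_step(phi_prev, phi_init, u_prev, du_prev, dy, y, y_ref,
                   eta=1.0, mu=1.0, rho=0.6, lam=1.0, eps=1e-5):
    """One step of a textbook CFDL-MFAC controller (illustrative parameters)."""
    # Projection-type estimate of the pseudo partial derivative (PPD).
    phi = phi_prev + eta * du_prev / (mu + du_prev ** 2) * (dy - phi_prev * du_prev)
    # Reset rule: fall back to the initial estimate when the estimate or the
    # last input increment is too small, or when the estimate changes sign.
    if abs(phi) <= eps or abs(du_prev) <= eps or (phi > 0) != (phi_init > 0):
        phi = phi_init
    # Incremental control law driven by the current tracking error.
    u = u_prev + rho * phi / (lam + phi ** 2) * (y_ref - y)
    return phi, u


def toy_plant(y, u):
    # Hypothetical plant used only to exercise the controller; the controller
    # itself never reads these coefficients.
    return 0.6 * y + 0.5 * u


def simulate(steps=300, y_ref=1.0, phi_init=1.0):
    y, y_prev = 0.0, 0.0
    u, du, phi = 0.0, 0.0, phi_init
    for _ in range(steps):
        dy = y - y_prev  # output increment caused by the previous input increment
        phi, u_new = cfdl_mfac_step(phi, phi_init, u, du, dy, y, y_ref)
        du, u = u_new - u, u_new
        y_prev, y = y, toy_plant(y, u)
    return y


if __name__ == "__main__":
    print(f"final output: {simulate():.6f}")
```

The controller only sees measured inputs and outputs, which is the sense in which the prototype is model-free; the paper's contribution, as the abstract states, is to additionally exploit the part of the model that is known.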

Asynchronous l2-l∞ Filtering for Discrete-Time Singular Nonhomogeneous Markov Jump Systems

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Wu, Chenxin; Hua, Mingang. Affiliation: College of Artificial Intelligence and Automation, Hohai University, Changzhou 213200, China

ISBN: (print) 9798350361674

This study investigates the design of l2-l∞ filters for asynchronous discrete-time singular nonhomogeneous Markov jump systems. A polytope set is used to characterize the time-varying transition probability. To describe the asynchronous phenomenon between system modes and filter modes, hidden Markov models are proposed. Based on the selected Lyapunov function, sufficient conditions are provided to ensure that the filtering error system is regular, causal and stochastically stable with a given l2-l∞ performance index. By solving the proposed conditions, a desired filter that satisfies the prescribed l2-l∞ performance index can be constructed. Finally, the effectiveness and potential of this method are verified through two illustrative examples. © 2024 IEEE.

Keywords: Hidden Markov models
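
As a small illustration of the polytope description mentioned in the abstract above, a time-varying transition probability matrix can be written as a convex combination of fixed vertex matrices. The sketch below is a generic construction under that assumption; the function name, the two vertex matrices, and the weights are hypothetical and are not taken from the paper.

```python
def polytopic_transition_matrix(vertices, weights, tol=1e-9):
    """Convex combination of vertex transition matrices (lists of rows)."""
    if len(vertices) != len(weights):
        raise ValueError("one weight per vertex is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be nonnegative and sum to 1")
    n = len(vertices[0])
    # Entry (i, j) is the weighted sum of the vertices' (i, j) entries.
    return [
        [sum(w * v[i][j] for w, v in zip(weights, vertices)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two hypothetical vertex matrices for a two-mode chain.
    v1 = [[0.9, 0.1], [0.2, 0.8]]
    v2 = [[0.5, 0.5], [0.6, 0.4]]
    p_k = polytopic_transition_matrix([v1, v2], [0.25, 0.75])
    for row in p_k:
        print(row, "row sum:", sum(row))
```

Because every vertex is row-stochastic and the weights form a convex combination, each row of the result still sums to 1, so the combination is a valid transition matrix at every time step.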

A Brief Survey of Deep Reinforcement Learning for Intersection Navigation of Autonomous Vehicles

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Liu, Yuqi; Zhang, Qichao; Zhao, Dongbin. Affiliations: Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China

ISBN: (print) 9798350361674

This paper presents a brief survey of deep reinforcement learning (DRL) for intersection navigation in autonomous driving. Intersection navigation poses significant challenges for autonomous driving (AD), considering the dynamic environment, high complexity, and safety requirements. DRL has emerged as a powerful learning framework in the field of AD, enabling agents to learn interactive driving policies in complex intersection scenarios. The paper begins by outlining the architecture of a general AD system. It then elaborates on the concepts of DRL. Additionally, the paper discusses emerging technologies such as social attention and graph attention mechanisms, which enhance the performance of DRL-based driving policies. Finally, challenges and future prospects are discussed. Challenges include complex intersection scenarios, generalization across different layouts, and the lack of standardized benchmarks and evaluation metrics. Overcoming these challenges and leveraging future advancements can lead to improved safety and efficiency in road traffic. © 2024 IEEE.

Keywords: Deep reinforcement learning

A Physics-Informed Action Network for Transient Stability Preventive Control

IEEE Transactions on Power Systems, 2023, vol. 38, no. 2, pp. 1771-1774

Authors: Liu, Youbo; Gao, Shuyu; Qiu, Gao; Liu, Tingjian; Ding, Lijie; Liu, Junyong. Affiliations: Sichuan Univ, Coll Elect Engn, Chengdu 610065, Peoples R China; State Grid Sichuan Elect Power Res Inst, Chengdu 610065, Peoples R China

This letter proposes a physics-informed action network (PIAN) for power system transient stability preventive control (TSPC). The network first uses deep learning to reduce the TSPC complexity. Unlike common data-driven methods that superficially imitate control experience, TSPC is then analytically embedded into the proposed PIAN, forcing the network to learn in-depth physical patterns. The well-learned PIAN enables highly generalized real-time decisions. Comparisons with one model-based and two data-driven baselines on the IEEE 39-bus system and the IEEE 145-bus system show that the proposed method enables highly reliable control decisions and outperforms the others in decision efficiency and generalizability.

Keywords: Power system stability; Transient analysis; Training; Thermal stability; Mathematical models; Stability criteria; Costs; Deep learning; physics-informed action network; transient stability preventive control

Enigma Sound: AI Driven Music Generation with Emotional Intelligence

3rd International Conference on Communication, Control, and Intelligent Systems, CCIS 2024

Authors: Patil, Apurva; Surendran, Surya; Ranade, Neeta. Affiliation: Mumbai, India

ISBN: (print) 9798331528201

AI-driven music generation with emotional intelligence (EI) is a new discipline that blends emotion detection and artificial intelligence to create personalised musical experiences. An artificial intelligence (AI) system can analyze data from various sources, such as user inputs, physiological signals, and facial expressions, among many others, in order to create music that resonates with an individual's emotional state. This project aims to develop an AI system that designs and generates music which recognizes and responds to human emotions by incorporating emotional intelligence. The system will recognize the user's emotional state and produce music that responds to their needs by using face identification, text analysis, and audio processing. These systems use natural language processing and deep learning models to comprehend emotions and generate music. © 2024 IEEE.

Keywords: Deep learning

Data-Based Predictive Control via Multistep Policy Gradient Reinforcement Learning

IEEE Transactions on Cybernetics, 2023, vol. 53, no. 5, pp. 2818-2828

Authors: Yang, Xindi; Zhang, Hao; Wang, Zhuping; Yan, Huaicheng; Zhang, Changzhu. Affiliations: Tongji Univ, Shanghai Inst Intelligent Sci & Technol, Shanghai 201804, Peoples R China; Tongji Univ, Dept Control Sci & Engn, Shanghai 201804, Peoples R China; East China Univ Sci & Technol, Sch Informat Sci & Informat, Shanghai 200237, Peoples R China

In this article, a model-free predictive control algorithm for real-time systems is presented. The algorithm is data driven and improves system performance based on multistep policy gradient reinforcement learning. By learning from the offline dataset and real-time data, the algorithm requires no knowledge of the system dynamics in its design or application. Cooperative games among multiple players over the time horizon are introduced to model predictive control as a multiagent optimization problem and to guarantee the optimality of the predictive control policy. To implement the algorithm, neural networks are used to approximate the action-state value function and the predictive control policy, respectively. The weights are determined using the method of weighted residuals. Numerical results show the effectiveness of the proposed algorithm.

Keywords: Predictive control; Real-time systems; Predictive models; Prediction algorithms; Reinforcement learning; Games; Cost function; Cooperative games; multistep reinforcement learning (RL); policy gradient methods; predictive control

Conic Input Mapping Design of Constrained Optimal Iterative Learning Controller for Uncertain Systems

IEEE Transactions on Cybernetics, 2023, vol. 53, no. 3, pp. 1843-1855

Authors: Zhou, Yuanqiang; Gao, Kaihua; Tang, Xiaopeng; Hu, Huanjia; Li, Dewei; Gao, Furong. Affiliations: Hong Kong Univ Sci & Technol, Dept Chem & Biol Engn, Hong Kong, Peoples R China; Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China; Shanghai Jiao Tong Univ, Minist Educ China, Key Lab Syst Control & Informat Proc, Shanghai 200240, Peoples R China; Guangzhou HKUST Fok Ying Tung Res Inst, Guangzhou 511458, Peoples R China

In this article, we study the optimal iterative learning control (ILC) for constrained systems with bounded uncertainties via a novel conic input mapping (CIM) design methodology. Due to the limited understanding of the process of interest, modeling uncertainties are generally inevitable, significantly reducing the convergence rate of the control systems. However, huge amounts of measured process data interacting with model uncertainties can easily be collected. Incorporating these data into the optimal controller design could unlock new opportunities to reduce the error of the current trial optimization. Based on several existing optimal ILC methods, we incorporate the online process data into the optimal and robust optimal ILC design, respectively. Our methodology, called CIM, utilizes the process data for the first time by applying convex cone theory and maps the data into the design of control inputs. CIM-based optimal ILC and robust optimal ILC methods are developed for uncertain systems to achieve better control performance and a faster convergence rate. Next, rigorous theoretical analyses of the two methods are presented. Finally, two illustrative numerical examples are provided to validate our methods with improved performance.

Keywords: Optimization; Uncertainty; Uncertain systems; Convergence; Control systems; Design methodology; Symmetric matrices; data-driven approach; iterative learning control (ILC); optimization; process control; robust design
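
To make the trial-to-trial idea behind ILC concrete, the sketch below implements a basic P-type update, in which the input for the next trial is the previous input plus a gain times the previous tracking error. This is a generic textbook scheme, not the CIM-based optimal ILC described in the abstract above; the first-order plant, the learning gain, and the function names are assumptions for illustration only.

```python
def run_trial(u, a=0.3, b=1.0):
    """Simulate one trial of a hypothetical plant y(t+1) = a*y(t) + b*u(t), y(0) = 0."""
    y, outputs = 0.0, []
    for u_t in u:
        y = a * y + b * u_t
        outputs.append(y)  # outputs[t] is y(t+1), the response to u[t]
    return outputs


def p_type_ilc(reference, trials=30, gain=0.8):
    """Repeat the same task and refine the input from the previous trial's error."""
    u = [0.0] * len(reference)
    for _ in range(trials):
        y = run_trial(u)
        error = [r - y_t for r, y_t in zip(reference, y)]
        # P-type update: correct each input sample by the error it produced.
        u = [u_t + gain * e_t for u_t, e_t in zip(u, error)]
    final = run_trial(u)
    return max(abs(r - y_t) for r, y_t in zip(reference, final))


if __name__ == "__main__":
    reference = [1.0] * 10
    print("max tracking error after learning:", p_type_ilc(reference))
```

With these illustrative numbers the error shrinks from one trial to the next; model uncertainty affects how fast that happens, which is the convergence-rate issue the abstract raises.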

Resource-efficient Model-free Adaptive Sliding-mode Heading Control for Unmanned Surface Vehicle with Data Dropouts

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Chen, Yutong; Zhao, Huarong; Xu, Dezhi; Yu, Hongnian; Dan, Chenshi. Affiliations: School of Internet of Things Engineering, Jiangnan University, Jiangsu, Wuxi, China; School of Electrical Engineering, Southeast University, China; School of Computing, Engineering and the Built Environment, Edinburgh Napier University, Edinburgh EH10 5DT, United Kingdom

ISBN: (print) 9798350361674

This paper addresses the issue of heading control for an unmanned surface vehicle (USV) in the presence of uncertainties. We propose a model-free adaptive sliding-mode heading control method that accounts for resource-efficient control and dual-channel data dropouts. First, a dynamic linearization model with a pseudo partial derivative is constructed, the value of the pseudo partial derivative is estimated, and a model-free sliding-mode heading controller is formulated for the USV. Second, an event-triggered mechanism allows triggering only when the event-trigger threshold is reached, thus significantly saving communication resources. Furthermore, a comprehensive compensation strategy is devised to handle data loss in both the forward and feedback channels. Finally, the convergence of the heading control error of the USV governed by the proposed method is rigorously proved, and the effectiveness of the proposed approach is verified through simulations. © 2024 IEEE.

Keywords: Unmanned surface vehicles
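
The event-triggered mechanism mentioned in the abstract above can be pictured with a simple send-on-delta rule: a new sample is transmitted only when it differs from the last transmitted value by at least a threshold. The sketch below is a generic illustration under that assumption; the triggering condition, the threshold, and the sample values are hypothetical and are not the paper's design.

```python
def event_triggered_transmit(samples, threshold):
    """Return (index, value) pairs for the samples that would be transmitted."""
    sent = []
    last_sent = None
    for k, x in enumerate(samples):
        # Trigger on the first sample, or when the deviation from the last
        # transmitted value reaches the threshold.
        if last_sent is None or abs(x - last_sent) >= threshold:
            last_sent = x
            sent.append((k, x))
    return sent


if __name__ == "__main__":
    heading = [0.0, 0.05, 0.12, 0.13, 0.30, 0.31]  # hypothetical heading samples
    packets = event_triggered_transmit(heading, threshold=0.1)
    print(packets)
    print(f"transmitted {len(packets)} of {len(heading)} samples")
```

Between triggers the receiver simply holds the last transmitted value, which is how this kind of rule trades a bounded deviation for fewer transmissions.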

Adaptive Control for Autonomous Underwater Vehicles without Kinetic Model using Deep Neural Network

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Jiang, Shihui; Zhang, Tianbo; Fang, Jing; Liu, Xing; Shen, Dong. Affiliations: CSSC Systems Engineering Research Institute, Beijing, China; Department of Automation, Beijing 100084, China; School of Mathematics, Renmin University of China, Beijing 100872, China

ISBN: (print) 9798350361674

This paper concerns the trajectory tracking control of autonomous underwater vehicles (AUVs) at the kinematic and kinetic levels. Since an accurate kinetic model of AUVs is unavailable in practice, an adaptive control using deep neural network (ACDNN) method is proposed for 4-DOF AUVs. Specifically, only the inertial matrix information, which is the most accessible, is used to achieve the tracking tasks, while the Coriolis and centripetal force matrix and the damping coefficient matrix are unknown. The unknown matrices are approximated and learned by the deep neural network. Compared with conventional adaptive neural network control, which is usually based on radial basis functions, the proposed method learns unknown kinetic features through the weights of nonlinear compositions of weighted features. It provides a good way to tackle the complex unknown AUV dynamics in complex marine environments. In addition, a separate time-scaled strategy is adopted to ensure online computing. The uniformly ultimate stability is achieved by Lyapunov analysis. Simulation results verify the effectiveness of the proposed method. © 2024 IEEE.

Keywords: Kinetic parameters
