Thanks to recent rapid advances in data-driven learning methodologies, reinforcement learning (RL) has emerged as a promising solution to the legged locomotion problem in robotics. In this letter, we propose CTS, a novel Concurrent Teacher-Student reinforcement learning architecture for legged locomotion over uneven terrains. Unlike the conventional teacher-student architecture, which first trains the teacher policy via RL and then transfers the knowledge to the student policy through supervised learning, our proposed architecture trains the teacher and student policy networks concurrently under the reinforcement learning paradigm. To this end, we develop a new training scheme based on a modified proximal policy optimization (PPO) method that exploits data samples collected from the interactions of both the teacher and the student policies with the environment. The effectiveness of the proposed architecture and the new training scheme is demonstrated through substantial quantitative simulation comparisons with state-of-the-art approaches and extensive indoor and outdoor experiments with quadrupedal and point-foot bipedal robot platforms, showcasing robust and agile locomotion capability. Quantitative simulation comparisons show that our approach reduces the average velocity tracking error by up to 20% compared to the two-stage teacher-student approach, demonstrating a clear advantage in blind locomotion tasks.
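The abstract does not spell out how the PPO objective is modified. As a rough illustration only, the sketch below evaluates the standard PPO clipped surrogate over a batch that pools samples gathered by a teacher policy and a student policy. The pooling rule, the batch layout, and every name (`clipped_surrogate`, `teacher_batch`, `student_batch`) are assumptions made for this example, not the CTS training scheme itself.

```python
def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate, averaged over the batch.

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i, and
    advantages[i] is its advantage estimate.
    """
    terms = []
    for r, adv in zip(ratios, advantages):
        # Clip the probability ratio to [1 - eps, 1 + eps].
        r_clipped = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # Pessimistic bound: take the smaller of the two candidate terms.
        terms.append(min(r * adv, r_clipped * adv))
    return sum(terms) / len(terms)


# Hypothetical toy batches; in practice these would come from rollouts.
teacher_batch = {"ratios": [1.5, 1.0], "advantages": [1.0, 0.5]}
student_batch = {"ratios": [0.5, 1.1], "advantages": [-1.0, 2.0]}

# Assumption: teacher and student samples are simply concatenated.
pooled_ratios = teacher_batch["ratios"] + student_batch["ratios"]
pooled_advantages = teacher_batch["advantages"] + student_batch["advantages"]

objective = clipped_surrogate(pooled_ratios, pooled_advantages)
print(objective)
```

Clipping caps the first teacher sample at a ratio of 1.2 and keeps the pessimistic term for the first student sample, so a single large ratio from either policy cannot dominate the pooled update.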
ISBN (print): 9798350354416; 9798350354409
This paper investigates the regulation problem for a class of discrete-time nonlinear non-affine systems with partially known structures using a model-free adaptive control (MFAC) algorithm. The core idea is to first linearize the known parts of the system model with traditional linearization methods, and then apply the dynamic linearization technique to the unknown structure of the controlled system and to the unmodeled dynamics produced by the traditional linearization, so that data-driven control (DDC) methods and model-based control (MBC) strategies complement each other and work together. Unlike the prototype MFAC algorithm, the control scheme devised in this paper fully utilizes the known structure of the system, so that the control objective can be better realized. Finally, the monotonic convergence of the system tracking error is rigorously proved, and the advantages of the developed algorithm are demonstrated by simulation comparisons.
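For orientation, the prototype MFAC that the abstract contrasts itself with can be sketched in its commonly cited compact-form dynamic linearization (CFDL) variant: a projection-type estimate of the pseudo partial derivative followed by a one-step control update. This is not the hybrid scheme proposed in the paper. The plant, the tuning values, and the function names below are hypothetical choices made only for illustration.

```python
import math


def plant(y, u):
    # Hypothetical non-affine plant; the controller never uses this formula.
    return 0.6 * y + u + 0.1 * math.sin(u)


def mfac_cfdl(setpoint=1.0, steps=200, rho=0.6, lam=1.0, eta=1.0, mu=1.0,
              phi_init=1.0, eps=1e-5):
    """Prototype CFDL-MFAC regulating the plant output to a constant setpoint."""
    y, y_prev = 0.0, 0.0          # y(k), y(k-1)
    u_prev, u_prev2 = 0.0, 0.0    # u(k-1), u(k-2)
    phi = phi_init                # pseudo partial derivative estimate
    outputs = []
    for _ in range(steps):
        du_prev = u_prev - u_prev2
        dy = y - y_prev
        # Projection-type update of the pseudo partial derivative estimate.
        phi += eta * du_prev / (mu + du_prev ** 2) * (dy - phi * du_prev)
        # Reset rule keeps the estimate well-conditioned and sign-consistent.
        if (abs(phi) <= eps or abs(du_prev) <= eps
                or math.copysign(1.0, phi) != math.copysign(1.0, phi_init)):
            phi = phi_init
        # Control law driven only by measured input/output data.
        u = u_prev + rho * phi / (lam + phi ** 2) * (setpoint - y)
        y_prev, y = y, plant(y, u)
        u_prev2, u_prev = u_prev, u
        outputs.append(y)
    return outputs


outputs = mfac_cfdl()
print(abs(outputs[-1] - 1.0))
```

The controller sees only input and output increments, which is what makes the prototype model-free. The approach described in the abstract would additionally exploit whatever part of the plant model is known.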
This study investigates the design of l2-l∞ filters for asynchronous discrete-time singular nonhomogeneous Markov jump systems. A polytope set is used to characterize the time-varying transition probability. To describ...
This paper presents a brief survey of deep reinforcement learning (DRL) for intersection navigation in autonomous driving. Intersection navigation poses significant challenges for autonomous driving (AD), considering ...
This letter proposes a physics-informed action network (PIAN) for power system transient stability preventive control (TSPC). The network first employs deep learning to reduce the TSPC complexity. Unlike common data-driven methods that superficially imitate control experience, TSPC is then analytically embedded into the proposed PIAN, forcing the network to learn in-depth physical patterns. The well-trained PIAN enables highly generalized real-time decisions. Comparisons with one model-based and two data-driven baselines on the IEEE 39-bus and IEEE 145-bus systems show that the proposed method yields highly reliable control decisions and outperforms the others in decision efficiency and generalizability.
AI-driven music generation with emotional intelligence (EI) is a new discipline that blends emotion detection and artificial intelligence to create personalised musical experiences. The Artificial Intelligence (AI) sy...
In this article, a model-free predictive control algorithm for real-time systems is presented. The algorithm is data-driven and improves system performance through multistep policy gradient reinforcement learning. By learning from an offline dataset and real-time data, the algorithm requires no knowledge of the system dynamics in its design or application. Cooperative games among multiple players over the time horizon are introduced to model predictive control as a multiagent optimization problem and to guarantee the optimality of the predictive control policy. To implement the algorithm, neural networks are used to approximate the action-state value function and the predictive control policy, respectively. The weights are determined by the method of weighted residuals. Numerical results show the effectiveness of the proposed algorithm.
In this article, we study optimal iterative learning control (ILC) for constrained systems with bounded uncertainties via a novel conic input mapping (CIM) design methodology. Because the process of interest is only partially understood, modeling uncertainties are generally inevitable and significantly reduce the convergence rate of the control system. However, large amounts of measured process data that reflect these model uncertainties can easily be collected. Incorporating these data into the optimal controller design could unlock new opportunities to reduce the error of the current trial optimization. Building on several existing optimal ILC methods, we incorporate the online process data into the optimal and robust optimal ILC designs, respectively. Our methodology, called CIM, utilizes the process data for the first time by applying convex cone theory and maps the data into the design of the control inputs. CIM-based optimal ILC and robust optimal ILC methods are developed for uncertain systems to achieve better control performance and a faster convergence rate. Next, rigorous theoretical analyses of the two methods are presented. Finally, two illustrative numerical examples validate our methods and show improved performance.
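To make the trial-to-trial idea behind ILC concrete, here is a minimal sketch of a classic P-type update, u_{j+1}(t) = u_j(t) + L e_j(t+1), on a scalar first-order plant. It is a deliberately simpler stand-in for the optimal and robust optimal designs in the abstract, and it does not implement CIM. The plant parameters, learning gain, and reference trajectory are hypothetical.

```python
import math


def run_trial(u, a=0.2, b=1.0):
    """Simulate x(t+1) = a*x(t) + b*u(t) from x(0) = 0; return y(1), ..., y(N)."""
    x, outputs = 0.0, []
    for u_t in u:
        x = a * x + b * u_t
        outputs.append(x)
    return outputs


def p_type_ilc(ref, gain=0.8, trials=30):
    """Repeat the same task, correcting the input with the last trial's error."""
    u = [0.0] * len(ref)
    max_errors = []
    for _ in range(trials):
        y = run_trial(u)
        # err[t] holds e(t+1) = ref(t+1) - y(t+1), the error that u(t) influences.
        err = [r - y_t for r, y_t in zip(ref, y)]
        max_errors.append(max(abs(e) for e in err))
        # P-type learning update applied to the whole input sequence.
        u = [u_t + gain * e for u_t, e in zip(u, err)]
    return u, max_errors


# Hypothetical reference: one period of a sine wave over 20 steps.
ref = [math.sin(2 * math.pi * t / 20) for t in range(1, 21)]
u_final, max_errors = p_type_ilc(ref)
print(max_errors[0], max_errors[-1])
```

With these values |1 - gain*b| = 0.2, so the peak tracking error shrinks from one trial to the next. Optimal ILC replaces the fixed gain with an input computed by solving an optimization problem at each trial.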
This paper addresses the issue of heading control for an unmanned surface vehicle (USV) in the presence of uncertainties. We propose a model-free adaptive sliding-mode heading control method considering resource-effi...
This paper concerns the trajectory tracking control of autonomous underwater vehicles (AUVs) at the kinematic and kinetic levels. Considering that an accurate kinetic model of AUVs is unavailable in practice, an adaptive con...