Author affiliation: Department of Advanced Systems Control Engineering, Graduate School of Science and Engineering, Saga University, Saga 840-8502, Japan
Publication: Soft Computing (Soft Comput.)
Year, volume, issue: 2005, Vol. 9, No. 11
Pages: 835-845
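Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (degrees may be conferred in engineering or science)]
Topics: actor-critic algorithms; nonholonomic mobile robot; predictive model; temporal difference learning; tracking control problem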
Abstract: In this paper, we propose two adaptive actor-critic architectures for solving control problems of nonlinear systems. The first method uses the two actual states at times k and k + 1 to update the learning algorithm; its basic idea is that the agent can improve its knowledge by taking information directly from the environment. The second method uses only the state at time k to update the algorithm and is called learning from prediction (or simulated experience). Both methods include one or two predictive models, which are assumed to be applied to construct predictive states and a model-based actor (MBA). The MBA, acting as the actor, can be viewed as a network whose connection weights are the elements of the feedback gain matrix. In the critic, two value functions are realized as pure static mappings, which can be reduced to a nonlinear current estimator by using radial basis function neural networks (RBFNNs). Simulation results are presented for a dynamical model of a nonholonomic mobile robot with two independent driving wheels. They show the effectiveness of the proposed approaches for the trajectory tracking control problem.
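To illustrate the difference between the two update schemes described in the abstract, here is a minimal Python sketch. It is not the paper's algorithm: it uses a single value function with a plain TD(0) rule, and the names `rbf_features`, `td_update`, and `predictive_model`, the one-dimensional state, the linear model, the reward, and all numeric constants are assumptions made only for illustration. The first call updates the critic from an actual next state; the second replaces that state with a model prediction.

```python
import numpy as np

def rbf_features(x, centers, width=1.0):
    """Gaussian radial basis activations of state x (one per center)."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))

def td_update(w, x, x_next, reward, centers, gamma=0.9, lr=0.1):
    """One TD(0) step for a critic V(x) = w . phi(x) with RBF features."""
    phi = rbf_features(x, centers)
    phi_next = rbf_features(x_next, centers)
    td_error = reward + gamma * (w @ phi_next) - (w @ phi)
    return w + lr * td_error * phi, td_error

def predictive_model(x, u):
    """Hypothetical one-step model x_{k+1} = x_k + 0.1 * u_k (assumption)."""
    return x + 0.1 * u

centers = np.array([[-1.0], [0.0], [1.0]])  # assumed RBF centers
w0 = np.zeros(3)                            # critic weights start at zero
x_k = np.array([0.5])                       # state at time k
u_k = np.array([-1.0])                      # control applied at time k
reward = -float(x_k @ x_k)                  # assumed quadratic tracking cost

# Scheme 1: update from the actual state observed at time k + 1
# (the value 0.42 stands in for a measurement and is made up).
x_actual_next = np.array([0.42])
w_actual, delta_actual = td_update(w0, x_k, x_actual_next, reward, centers)

# Scheme 2: update from the state at time k only; the next state is
# predicted by the model (learning from prediction).
x_pred_next = predictive_model(x_k, u_k)
w_pred, delta_pred = td_update(w0, x_k, x_pred_next, reward, centers)
```

Because the weights start at zero, both temporal difference errors equal the reward on this first step. The two schemes diverge on later steps, once the critic's estimate at the actual and the predicted next state differ.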