In this study, a data-driven learning algorithm was developed to estimate the optimal distributed cooperative control policy, which solves the cooperative optimal output regulation problem for linear discrete-time multi-agent systems. Notably, the dynamics of all the agent systems and of the exo-system are completely unknown. By combining adaptive dynamic programming with an internal model, a model-free off-policy learning method is proposed to estimate the optimal control gain and the distributed adaptive internal model using only the measurable data of the multi-agent systems. Moreover, unlike the traditional cooperative adaptive controller design method, the distributed internal model is approximated online. Convergence and stability analyses show that the estimated controller generated by the proposed data-driven learning algorithm converges to the optimal distributed controller. Finally, simulation results verify the effectiveness of the proposed method.
In this article, a data-driven distributed control method is proposed to solve the cooperative optimal output regulation problem of leader-follower multiagent systems. Unlike traditional studies on cooperative output regulation, this work develops a distributed adaptive internal model, which comprises a distributed internal model and a distributed observer that estimates the leader's dynamics. Without relying on the dynamics of the multiagent systems, two reinforcement learning algorithms, policy iteration and value iteration, are proposed to learn the optimal controller from online input and state data together with estimated values of the leader's state. Combining these methods establishes a basis for connecting data-driven distributed control methods with adaptive dynamic programming approaches, which provide their theoretical foundation.
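To make the distinction between policy iteration and value iteration concrete, the following is a minimal sketch for a single discrete-time linear-quadratic problem. It is not the algorithm from either abstract: it is the classical model-based counterpart, so it assumes the system matrices are known, whereas the methods above replace the model-dependent steps with measured input and state data. The matrices `A`, `B`, `Q`, `R` and the initial gain `K0` are hypothetical values chosen only for illustration.

```python
import numpy as np

def solve_dlyap(Ac, Qk):
    """Solve P = Ac^T P Ac + Qk by vectorizing the equation."""
    n = Ac.shape[0]
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(M, Qk.reshape(-1)).reshape(n, n)

def policy_iteration(A, B, Q, R, K0, iters=30):
    """Model-based policy iteration; K0 must be stabilizing."""
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        # Policy evaluation: cost matrix of the current gain K.
        P = solve_dlyap(Ac, Q + K.T @ R @ K)
        # Policy improvement: greedy gain with respect to P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

def value_iteration(A, B, Q, R, iters=2000):
    """Model-based value iteration; starts from P = 0, no stabilizing gain needed."""
    n = A.shape[0]
    P = np.zeros((n, n))
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati difference recursion.
        P = Q + A.T @ P @ A - A.T @ P @ B @ G
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Hypothetical example system (a discretized double integrator).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # assumed stabilizing initial gain

P_pi, K_pi = policy_iteration(A, B, Q, R, K0)
P_vi, K_vi = value_iteration(A, B, Q, R)
```

Under these assumptions both routines should arrive at the same cost matrix and gain. The practical difference is that policy iteration needs a stabilizing initial gain and typically converges in few iterations, while value iteration needs no such gain but usually takes many more iterations.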