ISBN: (print) 9781467379397
In this paper we present a real-time optimal control scheme for a Pendubot based on nonlinear model predictive control (NMPC) combined with nonlinear moving horizon estimation (NMHE). To control this fast, under-actuated nonlinear mechatronic system, we use the ACADO Code Generation tool to obtain a highly efficient Gauss-Newton real-time iteration algorithm tailored to the underlying nonlinear optimization problems. To further improve the solvers' performance, we parallelize particular algorithmic tasks within the estimation-control scheme. The overall control performance is verified experimentally by steering the Pendubot into its upper unstable equilibrium. We also provide a computational efficiency analysis covering different hardware/software configurations.
ISBN: (print) 9781424420780
In traditional Adaptive Dynamic Programming (ADP), only a one-step estimate is used in the training process, so learning efficiency is low. Including multi-step estimates speeds up learning. Eligibility traces record the past and current gradients of the estimate and can be combined with ADP to accelerate learning. In this paper, Heuristic Dynamic Programming (HDP), a typical ADP structure, is considered. An algorithm, HDP(λ), which integrates HDP with eligibility traces, is presented. The algorithm is described from both the forward view and the backward view for clarity, and the equivalence of the two views is analyzed. Furthermore, the differences between HDP and HDP(λ) are examined through both theoretical analysis and simulation results. The problem of balancing a pendulum robot (Pendubot) is adopted as a benchmark. The results indicate that, compared to HDP, HDP(λ) achieves a higher convergence rate and training efficiency.
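To illustrate how eligibility traces let a single temporal-difference error update earlier states as well, the following is a minimal sketch of a generic backward-view TD(λ) critic update. It is not the paper's HDP(λ) algorithm: the linear critic with one-hot features, the three-state chain, the function name `td_lambda_critic`, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np

def td_lambda_critic(episodes, n_features, alpha=0.1, gamma=0.95, lam=0.8):
    """Backward-view TD(lambda) for a linear critic J(x) = w . phi(x).

    Each episode is a list of (phi, cost, phi_next, terminal) tuples.
    This is an illustrative stand-in for a critic network, not HDP(lambda) itself.
    """
    w = np.zeros(n_features)
    for episode in episodes:
        e = np.zeros(n_features)  # eligibility trace, reset at episode start
        for phi, cost, phi_next, terminal in episode:
            j_next = 0.0 if terminal else w @ phi_next
            delta = cost + gamma * j_next - w @ phi  # one-step TD error
            e = gamma * lam * e + phi                # decay past gradients, add current one
            w = w + alpha * delta * e                # credit all recently visited states
    return w

# Hypothetical benchmark: chain 0 -> 1 -> 2 -> terminal, cost 1 on the last step.
I = np.eye(3)
episode = [
    (I[0], 0.0, I[1], False),
    (I[1], 0.0, I[2], False),
    (I[2], 1.0, I[2], True),
]

# After one episode, the one-step update (lam = 0) only touches the last state,
# while traces (lam = 0.8) already propagate the cost back to state 0.
w_one_step = td_lambda_critic([episode], 3, lam=0.0)
w_traces = td_lambda_critic([episode], 3, lam=0.8)

# With many episodes the critic approaches the discounted cost-to-go.
w_converged = td_lambda_critic([episode] * 500, 3, lam=0.8)

print("lam=0.0, 1 episode:", w_one_step)
print("lam=0.8, 1 episode:", w_traces)
print("lam=0.8, 500 episodes:", w_converged)
```

In this toy setting, setting `lam=0.0` recovers the one-step update, which is the sense in which a trace-based method generalizes the single-step scheme.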
In traditional Adaptive Dynamic Programming (ADP), only a one-step estimate is used in the training process, so learning efficiency is low. Including multi-step estimates speeds up learning. Eligibility traces record the past and current gradients of the estimate and can be combined with ADP to accelerate learning. In this paper, Heuristic Dynamic Programming (HDP), a typical ADP structure, is considered. An algorithm, HDP(λ), which integrates HDP with eligibility traces, is presented. The algorithm is described from both the forward view and the backward view for clarity, and the equivalence of the two views is analyzed. Furthermore, the differences between HDP and HDP(λ) are examined through both theoretical analysis and simulation results. The problem of balancing a pendulum robot (Pendubot) is adopted as a benchmark. The results indicate that, compared to HDP, HDP(λ) achieves a higher convergence rate and training efficiency.