The DC/DC boost converter is a non-minimum-phase system with a right-half-plane zero, which poses significant challenges for the design of effective control approaches. This article presents the design of a robust Proportional-Integral (PI) controller for this converter with an online adaptive mechanism based on a Reinforcement-Learning (RL) strategy. Classical PI controllers are simple and easy to build, but they lack robustness against a wide range of disturbances and adaptability to changing operating parameters. To address these issues, the RL adaptive strategy is used to optimize the performance of the PI controller. The main advantages of RL include lower sensitivity to error, more reliable results obtained by collecting data from the environment, near-ideal model behavior within a specific context, and better frequency matching in real-time applications. Random exploration, however, can lead to disastrous outcomes and unpredictable performance in real-world settings. Therefore, we opt for the Deterministic Policy Gradient (DPG) technique, which employs a deterministic action function rather than a stochastic one. DPG combines the benefits of the actor-critic architecture, deep Q-networks, and the deterministic policy gradient method. In addition, this method adopts the Snake Optimization (SO) algorithm to optimize the initial gain values, yielding more reliable results with faster dynamics. The SO method is known for its disciplined, nature-inspired approach, which results in faster decision-making and greater accuracy compared to other optimization algorithms. A hardware setup built around the CONTROLLINO MAXI Automation platform is constructed, offering a cost-effective and precise measurement method. Finally, simulation and experimental results demonstrate the robustness of this approach.
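To make the adaptation loop concrete, below is a minimal sketch of how a deterministic-policy-gradient update can tune PI gains online. It uses the compatible-function-approximation (COPDAC-Q) form of DPG from Silver et al. (2014), with a first-order surrogate plant standing in for the averaged boost-converter model. All plant constants, learning rates, feature choices, and the reward shaping are illustrative assumptions, not the paper's values, and the Snake Optimization step for the initial gains is replaced here by a fixed initial guess.

```python
# Sketch: online PI-gain adaptation via deterministic policy gradient (COPDAC-Q).
# Assumptions: the policy outputs the PI gains directly (mu(s) = theta), and a
# first-order surrogate plant replaces the averaged boost-converter dynamics.
import numpy as np

rng = np.random.default_rng(0)
dt, gamma = 0.01, 0.99
a_th, a_w, a_v = 1e-4, 1e-3, 1e-3     # learning rates (hypothetical values)

def psi(s):                            # state features for the value baseline
    e, ie = s
    return np.array([e, ie, e * e, 1.0])

theta = np.array([0.5, 0.1])           # policy parameters = [Kp, Ki] (initial guess)
w = np.zeros(2)                        # compatible-critic advantage weights
v = np.zeros(4)                        # value-baseline weights

def step_plant(y, u):                  # surrogate first-order plant, not a boost model
    return y + dt * (-2.0 * y + 5.0 * u)

y, ref, ie = 0.0, 1.0, 0.0
for k in range(20000):
    e = ref - y
    s = np.array([e, ie])
    gains = theta + 0.05 * rng.standard_normal(2)   # exploration around mu(s) = theta
    gains = np.clip(gains, 0.0, None)               # keep gains physically sensible
    u = gains[0] * e + gains[1] * ie                # PI law with the sampled gains
    y_next = step_plant(y, u)
    ie_next = ie + dt * e
    s_next = np.array([ref - y_next, ie_next])
    r = -(e * e) - 1e-3 * u * u                     # track reference, penalize effort

    # COPDAC-Q: Q(s,a) = (a - mu(s))^T w + v^T psi(s), and grad_theta mu = I here
    adv = gains - theta
    q = adv @ w + v @ psi(s)
    q_next = v @ psi(s_next)                        # on-policy action => advantage term 0
    delta = r + gamma * q_next - q                  # temporal-difference error
    w += a_w * delta * adv
    v += a_v * delta * psi(s)
    theta += a_th * w                               # deterministic policy gradient step
    theta = np.clip(theta, 0.0, None)

    y, ie = y_next, ie_next

print("adapted PI gains [Kp, Ki]:", theta)
```

In this sketch the actor update reduces to following the compatible critic's weights because the policy is the gain vector itself; a state-dependent gain schedule would replace the identity Jacobian with the actual policy gradient.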
The authors consider the learning-control problem in reinforcement learning (RL) with a continuous action space. Policy-gradient methods, and in particular the deterministic policy gradient (DPG) algorithm, provide a way to solve learning-control problems with continuous action spaces. However, when the RL task is complex enough that the function approximation must be tuned, hand-tuning the features is infeasible. To solve this problem, the authors extend the DPG algorithm with an approximate-linear-dependency (ALD)-based sparsification procedure, which enables DPG to automatically select useful, sparse features. To the authors' knowledge, this is the first work to consider the feature-selection problem in DPG. Simulation results illustrate that (i) the proposed algorithm can find the optimal solution of the continuous version of the mountain-car problem, and (ii) it achieves good performance over a large range of settings of the approximate-linear-dependency threshold parameter.
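The ALD sparsification test the abstract refers to can be sketched as follows: a candidate state is admitted to the feature dictionary only if its kernel-space reconstruction residual exceeds a threshold. The Gaussian kernel, the threshold value, and the class and function names are assumptions for illustration, and the paper's exact coupling of the dictionary with the DPG actor-critic is not reproduced here.

```python
# Sketch: approximate-linear-dependency (ALD) sparsification in the style of
# Engel et al.'s kernel dictionary construction. Kernel choice, threshold nu,
# and all names are illustrative assumptions.
import numpy as np

def gauss_kernel(x, y, sigma=0.5):
    d = np.asarray(x) - np.asarray(y)
    return np.exp(-d @ d / (2.0 * sigma ** 2))

class ALDDictionary:
    """Keeps a sparse dictionary of states; a new state is admitted only if it
    is not approximately linearly dependent on the current dictionary in the
    kernel-induced feature space."""
    def __init__(self, nu=0.1, kernel=gauss_kernel):
        self.nu, self.kernel = nu, kernel
        self.points = []            # dictionary elements
        self.K_inv = None           # inverse of the dictionary Gram matrix

    def consider(self, x):
        if not self.points:
            self.points.append(x)
            self.K_inv = np.array([[1.0 / self.kernel(x, x)]])
            return True
        k_vec = np.array([self.kernel(p, x) for p in self.points])
        c = self.K_inv @ k_vec                      # best linear reconstruction
        delta = self.kernel(x, x) - k_vec @ c       # ALD residual
        if delta > self.nu:                         # not representable: admit x
            n = len(self.points)
            K_inv_new = np.zeros((n + 1, n + 1))    # block-inverse rank-one update
            K_inv_new[:n, :n] = self.K_inv + np.outer(c, c) / delta
            K_inv_new[:n, n] = -c / delta
            K_inv_new[n, :n] = -c / delta
            K_inv_new[n, n] = 1.0 / delta
            self.K_inv = K_inv_new
            self.points.append(x)
            return True
        return False                # approximately dependent: dictionary unchanged

    def features(self, x):
        """Sparse kernel feature vector for the actor/critic approximators."""
        return np.array([self.kernel(p, x) for p in self.points])

# Usage: stream mountain-car-like 2-D states through the test.
d = ALDDictionary(nu=0.05)
rng = np.random.default_rng(1)
for s in rng.uniform(-1.2, 0.6, size=(200, 2)):
    d.consider(s)
print(f"retained {len(d.points)} of 200 states as dictionary features")
```

The threshold nu trades feature-set size against approximation quality, which is consistent with the abstract's observation that performance is stable over a large range of its settings.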