Author affiliations: Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi 214122, China; Department of Mechanical Engineering, Politecnico di Milano, Milan 20156, Italy
Publication: Chaos, Solitons & Fractals (Chaos Solitons Fractals)
Year/Volume: 2025, Vol. 199
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant 61991402 and Grant 62073154; in part by the China Scholarship Council under Grant 202306790017; in part by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant KYCX22_2304; in part by the Horizon Europe program under Marie Sklodowska-Curie Grant 101073037; and in part by the Italian Ministry of University and Research under Grant P2022EXP2W.
Subject: Stochastic systems
Abstract: This article presents an adaptive identifier–critic–actor neural optimal control scheme for stochastic nonstrict-feedback nonlinear systems with elastic state constraints. Optimal control is achieved through reinforcement learning built on an identifier–critic–actor structure of neural network approximators, in which the identifier estimates the unknown dynamics, the critic evaluates system performance, and the actor executes the control actions. The scheme builds the actual control from the virtual controls and dynamic surface controls, each designed as the optimal solution of its corresponding subsystem. The update laws are derived from the negative gradient of a simple positive function, which is generated from the partial derivative of the Hamilton–Jacobi–Bellman (HJB) equation. This design also relaxes the persistent excitation condition required by existing optimal control methods. A key innovation is an elastic constraint function that provides a unified framework for flexibly handling custom time constraints without changing the control structure. Stability analysis shows that all signals are semi-globally uniformly ultimately bounded in probability.
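The abstract's idea of deriving an update law from the negative gradient of a simple positive function can be illustrated with a toy example. The sketch below is not the paper's update law: it assumes a hypothetical residual that is linear in the weights, e(w) = phi^T w - target, takes P(w) = e(w)^2 as the positive function, and integrates dw/dt = -gain * dP/dw with Euler steps. All names (`residual`, `train`, `gain`, `dt`) and the data are illustrative assumptions.

```python
import numpy as np

def residual(w, phi, target):
    """Hypothetical residual, assumed linear in the weights: e(w) = phi^T w - target."""
    return float(phi @ w - target)

def positive_function(w, phi, target):
    """Simple positive function P(w) = e(w)^2 built from the residual."""
    return residual(w, phi, target) ** 2

def train(w0, regressors, targets, gain=0.5, dt=0.01, steps=20000):
    """Euler integration of the update law dw/dt = -gain * dP/dw."""
    w = np.array(w0, dtype=float)
    for k in range(steps):
        # Cycle through the available regressor/target pairs.
        phi = regressors[k % len(regressors)]
        target = targets[k % len(targets)]
        e = residual(w, phi, target)
        grad = 2.0 * e * phi          # dP/dw for P = e^2
        w -= dt * gain * grad         # step along the negative gradient
    return w

# Illustrative data (assumed): targets generated by known "true" weights.
w_true = np.array([1.0, -2.0])
regressors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
targets = regressors @ w_true

w_hat = train([0.0, 0.0], regressors, targets)
```

Because the regressors in this toy problem span the weight space and the targets are consistent, the estimate `w_hat` approaches `w_true`; the paper's actual laws act on identifier, critic, and actor weights and are derived from the HJB equation rather than from this assumed residual.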