This paper is concerned with a novel generalized policy iteration algorithm for solving optimal control problems for discrete-time nonlinear systems. The idea is to use an iterative adaptive dynamic programming algorithm to obtain iterative control laws that make the iterative value functions converge to the optimum. When the algorithm is initialized by an admissible control law, it is shown that the iterative value functions are monotonically nonincreasing and converge to the optimal solution of the Hamilton-Jacobi-Bellman equation, under the assumption that a perfect function approximation is employed. The admissibility property is analyzed, which shows that any of the iterative control laws can stabilize the nonlinear system. Neural networks are utilized to implement the generalized policy iteration algorithm, by approximating the iterative value function and computing the iterative control law, respectively, to achieve approximate optimal control. Finally, numerical examples are presented to verify the effectiveness of the present generalized policy iteration algorithm.
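For orientation, the optimality condition that such value functions are meant to converge to can be written in the standard discrete-time form below. The symbols are assumptions introduced here for illustration (system function F, utility U, optimal value function J^*); the abstract itself does not fix any notation.

```latex
% Assumed notation: x_k state, u_k control, F system function, U utility.
x_{k+1} = F(x_k, u_k), \qquad
J^{*}(x_k) = \min_{u_k} \bigl\{ U(x_k, u_k) + J^{*}(x_{k+1}) \bigr\}
```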
This paper presents a novel vision-based initial weld point positioning method for the welding systems of container manufacture. The new method is based on the geometric relationship between the two seams at the two different stages of the whole welding task, namely the initialization stage and the welding stage. At the first stage, the torch is aligned with the initial weld point manually, and the image feature and the parameters of the seam line are computed. At the second stage, the target image feature of the seam line is first computed using the geometric relationship, and the alignment of the torch is then automated based on the difference between the target and the current image features. The geometric relationship between the two seams is analyzed, and the realization of the new method, including the image processing, the computation of the parameters of the seam line, and the control system design, is given in detail. Finally, experiments are conducted to verify the effectiveness of the proposed initial weld point positioning method.
In this paper, we establish a new data-based iterative optimal learning control scheme for discrete-time nonlinear systems using an iterative adaptive dynamic programming (ADP) approach and apply the developed control scheme to solve a coal gasification optimal tracking control problem. According to the system data, neural networks (NNs) are used to construct the dynamics of the coal gasification process, the coal quality, and the reference control, respectively, so that the mathematical model of the system is unnecessary. The approximation errors from the neural network construction of the disturbance and the controls are both considered. Via system transformation, the optimal tracking control problem with approximation errors and disturbances is effectively transformed into a two-person zero-sum optimal control problem. A new iterative ADP algorithm is then developed to obtain the optimal control laws for the transformed system. The convergence property is developed to guarantee that the performance index function converges to a finite neighborhood of the optimal performance index function, and the convergence criterion is also obtained. Finally, numerical results are given to illustrate the performance of the present method.
Scene text detection can be formulated as a bi-label (text and non-text regions) segmentation problem. However, due to the high degree of intraclass variation of scene characters as well as the limited number of training samples, a single information source or classifier is not enough to segment text from non-text background. Thus, in this paper, we propose a novel scene text detection approach using a graph model built upon Maximally Stable Extremal Regions (MSERs) to incorporate various information sources into one framework. Concretely, after detecting MSERs in the original image, an irregular graph whose nodes are MSERs is constructed to label MSERs as text regions or non-text ones. Carefully designed features contribute to the unary potential, which assesses the individual penalties for labeling an MSER node as text or non-text, and color and geometric features are used to define the pairwise potential, which penalizes likely discontinuities. By minimizing the cost function via the graph cut algorithm, the different information carried by the cost function can be optimally balanced to obtain the final MSER labeling result. The proposed method is naturally context-relevant and scale-insensitive. Experimental results on the ICDAR 2011 competition dataset show that the proposed approach outperforms state-of-the-art methods in both recall and precision. (C) 2012 Elsevier B.V. All rights reserved.
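A cost function of the kind described here is commonly written as a sum of unary and pairwise terms over the graph. The form below is a generic sketch, not the paper's exact definition: the label vector \ell, the node set \mathcal{V} (MSERs), the edge set \mathcal{E}, the potentials \psi_i and \psi_{ij}, and the weight \lambda are all notation assumed for illustration.

```latex
% Assumed notation: \ell_i \in \{\text{text}, \text{non-text}\} is the label of MSER node i.
E(\ell) = \sum_{i \in \mathcal{V}} \psi_i(\ell_i)
        + \lambda \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(\ell_i, \ell_j)
```

Under this reading, graph cut searches for the labeling \ell that minimizes E(\ell), with \lambda balancing the per-node evidence against label smoothness between neighboring MSERs.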
In this paper, a novel data-driven stable iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal temperature control problems for water-gas shift (WGS) reaction systems. According to the system data, neural networks (NNs) are used to construct the dynamics of the WGS system and solve the reference control, respectively, where the mathematical model of the WGS system is unnecessary. Considering the reconstruction errors of NNs and the disturbances of the system and control input, a new stable iterative ADP algorithm is developed to obtain the optimal control law. The convergence property is developed to guarantee that the iterative performance index function converges to a finite neighborhood of the optimal performance index function. The stability property is developed to guarantee that each of the iterative control laws can make the tracking error uniformly ultimately bounded (UUB). NNs are developed to implement the stable iterative ADP algorithm. Finally, numerical results are given to illustrate the effectiveness of the developed method.
In this paper, a novel iterative Q-learning method called the "dual iterative Q-learning algorithm" is developed to solve the optimal battery management and control problem in smart residential environments. In the developed algorithm, two iterations are introduced, an internal iteration and an external iteration: the internal iteration minimizes the total cost of power loads in each period, and the external iteration makes the iterative Q-function converge to the optimum. Based on the dual iterative Q-learning algorithm, the convergence property of the iterative Q-learning method for the optimal battery management and control problem is proven for the first time, which guarantees that both the iterative Q-function and the iterative control law reach the optimum. The algorithm is implemented with neural networks, and numerical results and comparisons are given to illustrate the performance of the developed algorithm.
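The two-loop structure can be illustrated with a small tabular sketch. This is not the authors' algorithm or their neural network implementation: it is a plain Q-function iteration on a toy battery, and the function name `dual_loop_q_iteration`, the two-slot price profile, the unit load, the battery capacity, and the discount factor are all assumptions made up for this example. The inner loop sweeps the time slots of one period, and the outer loop repeats the sweep until the Q-function settles.

```python
import numpy as np

def dual_loop_q_iteration(prices=(1.0, 3.0), capacity=1, load=1.0,
                          gamma=0.9, n_external=200):
    """Toy tabular Q iteration for a battery over a periodic price profile.

    State: (time slot t, battery level x). Actions: -1 discharge one unit
    to the load, 0 idle, +1 charge one unit from the grid.
    All numbers are hypothetical and chosen only for illustration.
    """
    actions = (-1, 0, 1)
    n_slots = len(prices)
    Q = np.zeros((n_slots, capacity + 1, len(actions)))
    for _ in range(n_external):              # external iteration: repeat periods
        for t in reversed(range(n_slots)):   # internal iteration: slots in a period
            t_next = (t + 1) % n_slots
            for x in range(capacity + 1):
                for a, u in enumerate(actions):
                    x_next = x + u
                    if x_next < 0 or x_next > capacity:
                        Q[t, x, a] = np.inf   # infeasible battery action
                        continue
                    grid_cost = prices[t] * (load + u)  # energy bought from the grid
                    Q[t, x, a] = grid_cost + gamma * Q[t_next, x_next].min()
    greedy = np.array(actions)[Q.argmin(axis=2)]  # greedy action per (t, x)
    return Q, greedy

Q, greedy = dual_loop_q_iteration()
print(greedy)  # rows: time slots, columns: battery levels
```

In this toy setting the greedy law charges the empty battery in the cheap slot and discharges the full battery in the expensive slot, which is the qualitative behaviour one would expect from a cost-minimizing battery controller.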
In this paper, a novel iterative adaptive dynamic programming (ADP)-based infinite horizon self-learning optimal control algorithm, called the generalized policy iteration algorithm, is developed for nonaffine discrete-time (DT) nonlinear systems. The generalized policy iteration algorithm is a general idea of interacting policy iteration and value iteration algorithms of ADP. The developed generalized policy iteration algorithm permits an arbitrary positive semidefinite function to initialize the algorithm, where two iteration indices are used for policy improvement and policy evaluation, respectively. The convergence, admissibility, and optimality properties of the generalized policy iteration algorithm for DT nonlinear systems are analyzed for the first time. Neural networks are used to implement the developed algorithm. Finally, numerical examples are presented to illustrate the performance of the developed algorithm.
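The interplay of the two iteration indices can be sketched on a tiny tabular problem. The code below is a minimal illustration under stated assumptions, not the paper's neural network implementation: the integer state grid, the saturated dynamics x + u, the quadratic utility, and the function name `generalized_policy_iteration` are all hypothetical. The outer index performs policy improvement, and the inner index performs a fixed number of policy evaluation sweeps, starting from the zero function, which is positive semidefinite.

```python
import numpy as np

def generalized_policy_iteration(n_outer=30, n_eval=3):
    """Toy tabular generalized policy iteration (hypothetical example system).

    n_eval = 1 behaves like value iteration; a large n_eval approaches
    policy iteration. Dynamics and utility are illustrative assumptions.
    """
    states = np.arange(-5, 6)      # integer state grid
    controls = np.arange(-2, 3)    # admissible control values

    def step(x, u):
        # Saturated integrator: x_{k+1} = clip(x_k + u_k, -5, 5)
        return int(np.clip(x + u, -5, 5))

    def utility(x, u):
        # Quadratic utility U(x, u) = x^2 + u^2
        return float(x * x + u * u)

    index = {int(x): k for k, x in enumerate(states)}
    V = np.zeros(len(states))                  # positive semidefinite initial value
    policy = np.zeros(len(states), dtype=int)

    for _ in range(n_outer):
        # Policy improvement: greedy control law with respect to the current V
        for k, x in enumerate(states):
            costs = [utility(x, u) + V[index[step(x, u)]] for u in controls]
            policy[k] = controls[int(np.argmin(costs))]
        # Policy evaluation: n_eval synchronous sweeps under the fixed policy
        for _ in range(n_eval):
            V = np.array([utility(x, policy[k]) + V[index[step(x, policy[k])]]
                          for k, x in enumerate(states)])
    return states, V, policy

states, V, policy = generalized_policy_iteration()
for x, v, u in zip(states, V, policy):
    print(f"x={x:+d}  V={v:5.1f}  u={u:+d}")
```

Choosing the number of evaluation sweeps per improvement step is the design knob here: fewer sweeps make each outer step cheaper, while more sweeps evaluate each control law more accurately before it is replaced.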
Multicamera systems have many advantages and are widely used. However, many situations require camera parameters that are more accurate than those currently available. A new algorithm is proposed to improve the accuracy and consistency of these systems by adjusting the camera parameters. The algorithm assumes that the distribution of the measured point positions follows a Gaussian mixture model. Based on this model, point positions in space are estimated, and new camera parameters are computed from the estimation. A metric is defined to describe the difference between the newly computed and precalibrated camera parameters, and the parameters are then adjusted by minimizing this difference. Finally, the validity of the algorithm is confirmed by experiments. Two indicators that describe the accuracy and consistency are defined and applied to analyze the experimental data. (C) 2015 Society of Photo-Optical Instrumentation Engineers (SPIE)
In this study, an online adaptive optimal control scheme is developed for solving the infinite-horizon optimal control problem of uncertain non-linear continuous-time systems with the control policy having saturation constraints. A novel identifier-critic architecture is presented to approximate the Hamilton-Jacobi-Bellman equation using two neural networks (NNs): an identifier NN is used to estimate the uncertain system dynamics, and a critic NN is utilised to derive the optimal control, instead of the typical action-critic dual networks employed in reinforcement learning. Based on the developed architecture, the identifier NN and the critic NN are tuned simultaneously. Meanwhile, unlike policy iteration, where an initial stabilising control is indispensable, no special requirement is imposed on the initial control. Moreover, by using Lyapunov's direct method, the weights of the identifier NN and the critic NN are guaranteed to be uniformly ultimately bounded, while keeping the closed-loop system stable. Finally, an example is provided to demonstrate the effectiveness of the present approach.
This paper addresses the novel design of an underwater manipulator with a lightweight multilink structure and its free-floating autonomous operation. The concept design reduces the coupling between the manipulator and the vehicle efficiently, even in the case where the vehicle weight in air is not significantly greater than the manipulator weight. The specific implementation of the mechanical structure is elaborated. Moreover, a closed-loop control system based on binocular vision is proposed for underwater manipulation. In the end, experimental results demonstrate that the conceived underwater manipulator can accomplish the autonomous operation quickly.